ABSTRACT:



The mathematical theory of deterministic optimal control/differential 
games is applied to the study of some tactical allocation problems for 
combat described by Lanchester-type equations of warfare. A solution
procedure is devised for terminal control attrition games. H. K. Weiss'
supporting weapon system game is solved and several extensions considered.
A sequence of one-sided dynamic allocation problems is considered to study 
the dependence of optimal allocation policies upon model form. The solu- 
tion is developed for variable coefficient Lanchester-type equations when 
the ratio of attrition rates is constant. Several versions of Bellman's
continuous stochastic gold-mining problem are solved by the Pontryagin 
maximum principle, and their relationship to the attrition problems is 
discussed. A new dynamic kill potential is developed. Several problems 
from continuous review deterministic inventory theory are solved by the 
maximum principle. 



This task was supported by The Office of Naval Research. 



TABLE OF CONTENTS 



Section 

I. Introduction 

a. Optimal Control/Differential Games 

b. Dynamic Programming 

c. Tactical Allocation Problems 

II. Review of Pertinent Literature 

III. Some Tactical Allocation Problems 

a. The Allocation Problems 

b. Extensions of Lanchester-Type Models of Warfare 

c. Other Topics Not Included in this Report 

IV. Conclusions and Future Extensions 






Appendix 

A. The Isbell-Marlow Fire Programming Problem 

B. H. K. Weiss' Supporting Weapon System Game

C. Some One-Sided Dynamic Allocation Problems 

D. Solution to Variable Coefficient Lanchester-Type Equations

E. Connection with Bellman's Stochastic Gold-Mining Problem

F. A New Dynamic Kill Potential 

G. Applications to Deterministic Inventory Theory 





I. INTRODUCTION.

This report documents research findings for the time period 30 
March 1970 to 19 June 1970 under support of NR 276-027. This report 
discusses applications of the theory of differential games to tactical 
allocation problems in the Lanchester theory of combat. We also discuss 
some extensions for Lanchester-type models of warfare and deterministic 
inventory theory. A companion report [76] discusses other research 
findings of the contract period with respect to surveillance-evasion 
problems of Naval warfare. 

The goal of this research is to determine the structure of optimal 
allocation policies for tactical situations describable by Lanchester- 
type equations of warfare. We hope to provide insight into such questions 
as 

(1) How should targets be selected? 

(2) Do target priorities change with time? 

(3) Do battle termination circumstances affect the optimal
allocation policies?

(4) How does the nature of the attrition process affect target
selection?

(5) What is the effect of ammunition constraints?

(6) How do the uncertainty and confusion of combat affect the
optimal selection rules?

We develop our theory of target selection through the examination of a 
sequence of simplified models. These combat models are too simple to 
be taken literally but should be interpreted as indicating general 
principles to serve as hypotheses for subsequent computer simulation 
studies or field experimentation. 






In warfare, decisions must be made sequentially over a period of
time, and the world is changed as a result of these decisions. The 
Lanchester theory of combat has been developed to describe such dynamic 
situations. Of even more interest to defense planners than how to 
describe combat, is how to optimize the dynamics of combat. Many times 
the static optimization techniques of linear and non-linear programming 
are not applicable, so new dynamic optimization techniques were developed 
in the 1950’s.

Actually, many such situations may be formulated as classical con- 
strained calculus of variations problems (technically referred to as 
the problems of Bolza, Lagrange and Mayer). Because of inequality 
constraints and non-negative variables in such problems, the classical 
methods are difficult to apply. Thus, dynamic programming [9] was 
originally developed as a computational technique for variational pro- 
blems, although its principles have proven to be of much wider applica- 
bility. This was also the impetus for the development of the maximum 
principle by the Soviet mathematician L. Pontryagin [68]. During this 
period military problems also rekindled interest in the game theory of 
J. von Neumann [78] with extensions being made to multi-move discrete 
games [9], [29] and differential games [50]. It seems appropriate to 
discuss these techniques briefly. 

a. Optimal Control/Differential Games.

These techniques may be used to optimize systems whose behavior 
is described by a system of differential equations. The same basic 
concepts are referred to as optimal control when there is one controller 
and one criterion function and as a differential game with two controllers 
and two criterion functions (which sum to zero). Recently the term
"generalized control theory" has been coined [42], [43] for these dynamic 
optimization techniques. A common point of such models is that time 
is treated continuously. Major work has been done by L. Pontryagin 
and others in the USSR (see survey papers by [13], [71] and references 
in [8], [33]), and R. Bellman, L. Berkovitz, Y. C. Ho, and others in 
the US. R. Isaacs has independently developed an extensive theory
of differential games and has published a book containing numerous
examples [50].

However, these techniques apply primarily to deterministic systems. 
Frequently numerical methods must be used when closed-form analytic 
solutions can’t be obtained. Dynamic programming was developed at RAND 
by R. Bellman and others [9], [10] for such cases. 

b. Dynamic Programming.

Although numerical solution of variational problems was one of 
the initial reasons for the development of dynamic programming, this 
technique has proven to be of much wider applicability. It is a dual 
approach to Lagrange’s method of variations, which treats an extremal 
curve as a sequence of points and develops a differential equation to 
be satisfied at each such point. On the other hand, dynamic programming 
generates an optimal trajectory by considering the "direction of best 
return" working backwards from the problem’s end. It bears a close 
relationship to C. Caratheodory’s notion of a geodesic gradient, and
this has rekindled interest in much classical work. 

Although we haven’t explicitly used dynamic programming in the 
present work, its underlying principle of optimality [9] continues to 
apply when the assumption of continuous time required by differential
game theory no longer holds. Historically (see Chapter X of [9]),
multi-move discrete games were considered before differential games, 
which are a limiting case. For future work in which it may be desirable 
to approximate the real world more closely with less restrictive assumptions
(for example, attrition rates which don’t lead to closed-form solutions 
of the corresponding differential equations), it may be necessary to 
employ numerical procedures, and we have given this consideration. 

c. Tactical Allocation Problems.

We think that combining Lanchester-type models of warfare with 
the theory of differential games/dynamic programming has a great potential 
for providing insight into the optimization of the dynamics of combat 
continuing over a period of time with a choice of tactics available to 
both sides and subject to change with time. In the present work our 
goal is to determine the factors upon which the optimal allocation 
depends and also what this dependence is. We have considered the following
aspects:

(1) combatant objectives (form of criterion function and valuation
of surviving forces),

(2) termination conditions of conflict, 

(3) type of attrition process, 

(4) force strengths, 

(5) effect of resource constraints. 

Our conclusion is that any or all of the above factors may influence 
the structure of the optimal allocation policies depending upon the form 
of the model used. Judgment is required, then, to decide which type of 
model is most applicable for any specific problem. 

Besides the study of problems of land combat, these models have
numerous applications to problems of Naval warfare: 

(1) optimal allocation of Naval fire support, 

(2) allocation of Naval airpower between ground-support and 
strategic targets,

(3) worth of Naval transport capability for troop build-up in 
combat zone. 

We envision these idealized models as being used to provide insight and 
to generate hypotheses to be tested in subsequent work under less re- 
strictive assumptions (such as computer Monte Carlo simulation or actual 
field experimentation). 

Our research approach has been to consider a sequence of models 
of increasing complexity. We have considered models for two types of 
choice situations:

(1) selection of target type, 

(2) regulation of firing rate. 

We have also found it necessary to develop several extensions to the 
theory of Lanchester-type models of warfare and also to differential 
game theory. 

In considering more and more complex models, we have started with 
one-sided models and done some work for the two-sided case. We have 
learned about the structure of optimal allocation policies by solving 
numerous specific problems. We have found that the application of 
existing theory to the prescribed duration battle is straightforward 
but that (even for the one-sided case) new approaches and concepts had 
to be developed for battles which terminate by the course of combat 
being steered to a prescribed state. In these terminal control problems 
we have considered a "fight to the finish" for mathematical convenience,
and our approach, of course, applies to any terminal control game. Our 
work shows that selection of the appropriate scenario (prescribed dura- 
tion or terminal control) may be an important decision in a defense 
planning study. We have also applied the existing theory of differential 
games to pursuit and evasion problems [76]. We have found that there
are numerous mathematical differences between pursuit-evasion and attri- 
tion differential games. 

These models consider the continual allocation of resources after 
the battle has started. We could consider models for the initiation 
and termination of conflict and also the allocation of resources across 
a broad front before the actual battle begins. Such considerations are 
beyond the scope of the present work. 

We have also looked for other areas of interest to defense planners 
for the application of the knowledge we have gained through our study 
of tactical allocation problems. Thus, we consider some models of 
deterministic, continuous-review inventory processes in Appendix G. 

II. REVIEW OF PERTINENT LITERATURE.

We reviewed the literature in two subject areas: Lanchester theory
of combat and differential games. We do not attempt an exhaustive review
of the literature, since that was not the purpose of this research. 
However, we try to highlight some major works. 

One of the earliest attempts to establish a mathematical model 
of the dynamics of mass combat was by Lanchester [61] in 1916. He developed
several deterministic models, each a system of ordinary
differential equations which related the strengths of opposing military 
forces to length of combat. During World War II B. O. Koopman extended
Lanchester's results and also suggested a reformulation of the problem
in stochastic form [66]. After World War II the RAND Corporation carried
on further studies whose results were summarized by Snow [72]. H. K.
Weiss, then at Aberdeen Proving Ground, and others [7], [22], [28], [37], [38],
[80], [81] have subsequently developed deterministic Lanchester models.
R. Brown developed models for the stochastic analysis of combat [23].
The relationship between the above mentioned stochastic and deterministic
Lanchester formulations was pointed out relatively early in their
development (see [72], for example) but is probably best presented in a
recent report by B. O. Koopman [60]. Bonder [21] has done work on the
estimation of the Lanchester attrition-rate coefficient (for weapon
systems that adjust fire based on results of the previous round fired).
A good review of the Lanchester theory of combat is by Dolansky [28],
and this includes a comprehensive list of references through 1964.

The study of differential games was initiated by R. Isaacs at RAND
in the early 1950’s [46], [47], [48], [49], but this work has not been
available to a wide audience until quite recently [50]. His basic concept,
"the tenet of transition," is a generalization of Bellman’s [9]
"principle of optimality" to a competitive environment, and this is used
to develop necessary conditions for optimal strategies. A more recent
and more rigorous development of these basic necessary conditions is by 
Berkovitz [12]. Since the excellent paper by Ho, Bryson and Baron [44]
in 1965, there has been an explosion of papers on differential
games, but almost all deal exclusively with pursuit-evasion problems.
Excellent survey papers which bear this out are by Simakova (Russian
literature) [71] and Berkovitz [13]. A more detailed review of differential
game literature for pursuit and evasion applications is to be
found in a companion report [76]. At a fairly recent workshop on 
differential games it was noted that there have been no new significant 
examples [25] since the publication of Isaacs’ book. Other books which
treat differential games are by Blaquiere et al. [16] (extension of 
their geometrical approach to optimal control) and Bryson and Ho [24] 
(Chapter 9). 

In 1964 Dolansky [28] noted that the Lanchester theory of combat 
was insufficiently developed in the area of target selection for combat 
between heterogeneous forces (optimal control/differential games). Even 
the two references cited by him, Weiss [82] and Isbell and Marlow [52], 
have been subsequently extended [74]. Since Dolansky’s article, no
further examples have been published in the literature except for the
ones in Isaacs’ book [50].

One aspect that has impressed this author has been the diversity 
of approaches applied to the same problem by the researchers at RAND. 
Discrete and continuous models, deterministic and stochastic models are 
used in a complementary manner to help each other and provide insight.
We note in this connection the discrete and continuous versions of the
strategic bombing problem (Bellman’s stochastic gold-mining problem [9]).
We also note that the War of Attrition and Attack of Isaacs is the continuous
version of other discrete sequential decision-making models of
the strategic/tactical deployment of airpower studied at RAND [14], [15],
[34].

Differential game theory has also been used to study target
selection in combat described by Lanchester-type equations at the 
University of Michigan. Results are summarized in a report [73], which 
references working papers for further details. We have not yet reviewed 
these working papers. However, it appears that this work does not 
consider the various possible model forms that we do in the present 
work and, hence, the dependence of optimal allocation policies on model 
form is not recognized. 

III. SOME TACTICAL ALLOCATION PROBLEMS.

In this section we summarize results for the problems we have 
studied and explain why these problems were studied. A more detailed 
discussion on many points is to be found in the appendices. The current 
phase of this work has stressed extension of results in the literature. 
This has been by necessity both to familiarize ourselves with past 
work and to extend many partial or incomplete results. The present 
state of differential game/optimal control theory allows problems, 
which twenty years ago would be very difficult (if not impossible) to 
solve by classical variational methods, to be readily solved. 

First we review the various tactical allocation problems which 
we have studied, and then we discuss two extensions we have made to the 
Lanchester theory of combat. A section is included to summarize some
work omitted from this report because of its incomplete nature.

a. The Allocation Problems.

In Appendix A we derive a complete solution to the Isbell and 
Marlow [52] fire programming problem. This is a terminal control problem 
(the battle terminates when the course of battle has reached some
specified state), and such attrition games are not treated in Isaacs’
book [50]. We first solved this problem to gain insight into a solution
phenomenon of H. K. Weiss’ supporting weapon system game [82]. In an
optimal control problem one determines extremals and domains of
controllability for each terminal state, but in a differential game further
investigations are required to verify that one’s opponent can’t "block"
entry to an unfavorable (losing) terminal state against one’s extremal
strategy. It may be that he can steer the course of battle to an end 
favorable (winning) to him by use of other than his extremal strategy. 
This phenomenon has not occurred in any pursuit and evasion differential 
game in the literature. We discuss the structure of optimal target 
engagement policies for the Isbell-Marlow problem. Later (in Appendix 
C) we contrast the same combat model in scenarios of a prescribed dura- 
tion battle and a "fight to the finish." 

In Appendix B we apply the theory of differential games to H. K. 
Weiss’ supporting weapon system game. This problem was originally
solved by assuming a special form for the solution [82]. Subsequent 
work [58] has considered the simpler case of a prescribed duration 
engagement. We have found the existing framework of differential game 
theory inadequate for solving the supporting weapon system game and have 
consequently introduced the concept of a "blockable" terminal state 
which we have discussed briefly above. Such behavior does not occur 
in a one-sided problem. The book by Blaquiere et al. [16] defines a
similar concept of a "strongly playable strategy," but there are no 
concrete examples given to motivate this notion. 

In the future we would propose to formalize the notion of a
"blockable" terminal state as a contribution to the theory of differential
games. We also discuss several extensions of the original supporting
weapon system game in Appendix B. It seems appropriate to devise
further extensions to study facets like: (a) target priorities for
fire support systems, (b) when to engage the enemy fire support system
instead of providing fire support for other forces. We have examined some
scenarios not included in this report.

In Appendix C we examine a sequence of problems to study the 
dependence of optimal allocation policies on model form. We consider 
two types of choice problems: (1) target selection and (2) firing rate. 

In studying the problem of target selection we re-study the Isbell- 
Marlow fire programming problem to learn about the structure of best 
policies through a series of contrasts:

(a) prescribed duration versus terminal control battle, 

(b) two versus many target types, 

(c) square law versus linear law attrition. 

We discuss differences in the structure of optimal policies for all
these cases. We find, for example, that if one assigns a
worth to targets in proportion to their kill rate against oneself, then
there is never a switch in target priorities. We are also motivated
to define the new dynamic kill potential of Appendix F.

We also study the best firing rate in a sequence of models all 
having resource constraints. We are interested in ascertaining under
what circumstances one should "hold his fire." We consider a simplified
model for combat between two homogeneous forces in which one side has 
an ammunition constraint that will be binding in a battle of prescribed
duration and the attrition rates are constant. Under these circum- 
stances, the best policy is to fire at one’s maximum possible rate until 
all ammunition has been expended. We see that this model is not too 
realistic and are led to consider cases where the attrition rates vary 
with time or force separation. This leads to variable coefficient 
Lanchester-type equations and has been our impetus for seeking solution 
methods for such equations. We have, by necessity, had to extend the 
existing theory of Lanchester-type models, and we discuss this in 
Appendix D. We also consider several other scenarios for
limited resources. 
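
The constant-rate conclusion above can be checked numerically. The sketch below is illustrative only. It assumes a square-law model for two homogeneous forces, dx/dt = -a y and dy/dt = -b u(t) x, in which the X-force fires at a fraction u(t) of its maximum rate and carries ammunition for only part of a battle of prescribed duration. The model form, the parameter values, and the names `simulate`, `fire_early`, and `fire_late` are assumptions made for this example and are not taken from Appendix C.

```python
def simulate(fire_schedule, x0=100.0, y0=80.0, a=0.01, b=0.01, T=10.0, dt=0.001):
    """Euler integration of dx/dt = -a*y, dy/dt = -b*u(t)*x over [0, T].

    fire_schedule(t) returns u(t) in [0, 1], the fraction of the maximum
    firing rate used by the X-force at time t (hypothetical model).
    """
    x, y = x0, y0
    steps = int(round(T / dt))
    for i in range(steps):
        u = fire_schedule(i * dt)
        # Update both forces from the state at the start of the step.
        x, y = x - a * y * dt, y - b * u * x * dt
    return x, y


T = 10.0      # prescribed battle duration (assumed)
ammo = 4.0    # ammunition, in time units of full-rate fire (assumed)


def fire_early(t):
    """Fire at the maximum rate until the ammunition is gone."""
    return 1.0 if t < ammo else 0.0


def fire_late(t):
    """Hold fire, then expend the same ammunition at the end of the battle."""
    return 1.0 if t >= T - ammo else 0.0


x_early, y_early = simulate(fire_early)
x_late, y_late = simulate(fire_late)
print("fire early: x(T) = %.3f, y(T) = %.3f" % (x_early, y_early))
print("fire late:  x(T) = %.3f, y(T) = %.3f" % (x_late, y_late))
```

With these hypothetical values, firing early leaves the X-force with more survivors and the Y-force with fewer than holding fire until the end. Replacing the constants a and b by functions of time or range is the step that leads to variable coefficient equations.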

In Appendix C we have also included a discussion of the usefulness 
of one-sided models for studying two-sided phenomena. We point out the 
close relationship between optimal control and differential game theory. 
Since the Hamiltonian is usually separable in the control variables,
i.e., a function independent of φ plus a function independent of ψ (for
a practical example where this isn’t true see [11]), we essentially have
two "independent" optimal control problems (one a maximization and the 
other a minimization) and the optimal strategies are pure. We note that 
this is not true for many important models in game theory (Col. Blotto 
game, for example [29]). 

We also discuss the implications of the idealized models we have 
considered. Hence, we discuss optimal tactical allocation, intelligence, 
command and control systems, and human decision making. We have learned 
that optimal strategies are a function of model form, and there usually 
will be several possible forms available. 

In Appendix E we develop the solution to the continuous version
of Bellman’s stochastic gold-mining (strategic bombing) problem [9] by 
optimal control theory. We do so because the solution to this problem 
has a very similar structure to that for allocation of fire over targets 
undergoing linear law attrition. We consider two types of models: (1)
maximum return for prescribed duration use and (2) maximum return for
specified risk. The structures of the optimal allocation policies are 
slightly different in these two cases. Originally, Bellman used varia- 
tional methods and knowledge of discrete analogues to solve these problems. 
The new methods are easier to apply and provide more insight (for example, 
the distinction between the two problems considered above). Our study
of this problem and its similarity to other tactical allocation problems 
studied in Appendix C suggest that there may be a general structure 
underlying all such problems. We also are motivated to consider other 
formulations (for example, a force is only subject to attrition from 
targets that it engages) of tactical allocation problems with Lanchester- 
type models of warfare. 
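
The discrete analogue mentioned above can be sketched with a short backward recursion. The sketch is illustrative and assumes the usual finite-stage statement of the gold-mining problem: mines A and B hold amounts x and y; working mine A succeeds with probability p1, yields the fraction r1 of what remains there, and leaves the machine usable, and otherwise the machine is lost (similarly p2, r2 for mine B). The numerical values and the name `gold_mining_value` are hypothetical, and this is the discrete recursion rather than the continuous maximum-principle solution of Appendix E.

```python
from functools import lru_cache


def gold_mining_value(x, y, stages, p1=0.8, r1=0.3, p2=0.6, r2=0.5):
    """Maximum expected gold from at most `stages` uses of one machine.

    Assumed recursion (discrete analogue):
        f_n(x, y) = max( p1*(r1*x + f_{n-1}((1-r1)*x, y)),
                         p2*(r2*y + f_{n-1}(x, (1-r2)*y)) ),   f_0 = 0.
    """

    @lru_cache(maxsize=None)
    def f(n, i, j):
        # i, j count the successful uses so far at mines A and B,
        # which fixes the amount of gold remaining in each mine.
        if n == 0:
            return 0.0
        xa = x * (1.0 - r1) ** i
        yb = y * (1.0 - r2) ** j
        work_a = p1 * (r1 * xa + f(n - 1, i + 1, j))
        work_b = p2 * (r2 * yb + f(n - 1, i, j + 1))
        return max(work_a, work_b)

    return f(stages, 0, 0)


one_stage = gold_mining_value(100.0, 100.0, 1)
ten_stage = gold_mining_value(100.0, 100.0, 10)
print(one_stage, ten_stage)
```

Working backwards from the last stage in this way is the "direction of best return" idea; the choice at each stage depends only on the amounts remaining, which is the kind of structure the continuous version shares with fire allocation under linear law attrition.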

b. Extensions of Lanchester-Type Models of Warfare.

We have, by necessity, made two extensions to the Lanchester theory 
of combat: 

(1) solution to Lanchester-type equations with variable coefficients,

(2) development of notion of a dynamic kill potential. 

In Appendix D we show how to solve Lanchester-type equations for combat 
between two homogeneous forces when the attrition rates are variable 
provided that their quotient is a constant. Solutions are developed 
for either time or force separation as the independent variable. We
also discuss the relationship of our work to that of others [20], [73]. 
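
One way to see why a constant ratio of attrition rates helps is sketched below. It is an illustration under stated assumptions and not necessarily the derivation of Appendix D. Suppose dx/dt = -a(t) y and dy/dt = -b(t) x with a(t) = ka h(t) and b(t) = kb h(t), so that a(t)/b(t) = ka/kb is constant. Then the substitution tau = integral of h from 0 to t reduces the system to constant coefficients, and kb x^2 - ka y^2 stays constant. The function h(t) = 1 + t, the parameter values, and the names `closed_form` and `integrate` are hypothetical.

```python
import math


def closed_form(t, x0, y0, ka, kb, tau):
    """Solution in the transformed time tau(t) = integral of h from 0 to t."""
    w = math.sqrt(ka * kb)
    s = tau(t)
    x = x0 * math.cosh(w * s) - y0 * math.sqrt(ka / kb) * math.sinh(w * s)
    y = y0 * math.cosh(w * s) - x0 * math.sqrt(kb / ka) * math.sinh(w * s)
    return x, y


def integrate(t_end, x0, y0, ka, kb, h, steps=2000):
    """Fourth-order Runge-Kutta for dx/dt = -ka*h(t)*y, dy/dt = -kb*h(t)*x."""
    def f(t, x, y):
        return -ka * h(t) * y, -kb * h(t) * x

    dt = t_end / steps
    t, x, y = 0.0, x0, y0
    for _ in range(steps):
        k1 = f(t, x, y)
        k2 = f(t + dt / 2, x + dt / 2 * k1[0], y + dt / 2 * k1[1])
        k3 = f(t + dt / 2, x + dt / 2 * k2[0], y + dt / 2 * k2[1])
        k4 = f(t + dt, x + dt * k3[0], y + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    return x, y


def h(t):
    """Assumed common time dependence of both attrition rates."""
    return 1.0 + t


def tau(t):
    """Integral of h from 0 to t."""
    return t + 0.5 * t * t


x_num, y_num = integrate(2.0, 100.0, 80.0, 0.02, 0.01, h)
x_cf, y_cf = closed_form(2.0, 100.0, 80.0, 0.02, 0.01, tau)
print(x_num, x_cf)
print(y_num, y_cf)
```

The numerical and closed-form values agree to within the integration error, and the quantity kb x^2 - ka y^2 is unchanged from its initial value. The same substitution works with force separation as the independent variable when both rates share a common dependence on range.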

In Appendix F we define the concept of a weapon system firepower 
potential. We obtained our motivation for this development from our 
study of tactical allocation problems using optimal control theory. 

Our approach provides a measure of the firepower capability of a weapon 
system giving consideration to the dynamics of combat. 

When one interprets the maximum principle and dual variables 
which one is using (or attempts derivations), one sees that the rate
of return for engaging a target (as measured by the rate of change of 
a terminal payoff for the scenario) changes during the course of battle. 
One is tempted to try to extend the notion of evolution of target worth 
to cases where there is no allocation problem. By use of the adjoint 
system to the Lanchester-type equations, one can do this. Our method 
may be used to study such facets of combat as the worth of mobility in 
battle and the effect of different range capabilities for weapon systems.
This is the end of our guided tour of the appendices. 
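
The role of the adjoint system can be illustrated on the simplest case. The sketch below is not the definition given in Appendix F; it only shows how dual variables measure a time-varying marginal worth. It assumes constant-coefficient equations dx/dt = -a y, dy/dt = -b x, a prescribed duration T, and a hypothetical terminal payoff J = x(T) - v y(T). The adjoint equations are then dpx/dt = b py and dpy/dt = a px with px(T) = 1 and py(T) = -v. All names and values are assumptions made for illustration.

```python
import math


def terminal_state(x0, y0, a, b, T):
    """Closed-form x(T), y(T) for dx/dt = -a*y, dy/dt = -b*x."""
    w = math.sqrt(a * b)
    c, s = math.cosh(w * T), math.sinh(w * T)
    return x0 * c - y0 * math.sqrt(a / b) * s, y0 * c - x0 * math.sqrt(b / a) * s


def payoff(x0, y0, a, b, T, v):
    """Assumed terminal payoff J = x(T) - v*y(T)."""
    xT, yT = terminal_state(x0, y0, a, b, T)
    return xT - v * yT


def adjoint(t, a, b, T, v):
    """Solution of dpx/dt = b*py, dpy/dt = a*px with px(T) = 1, py(T) = -v."""
    w = math.sqrt(a * b)
    s = T - t  # time remaining in the battle
    px = math.cosh(w * s) + v * math.sqrt(b / a) * math.sinh(w * s)
    py = -v * math.cosh(w * s) - math.sqrt(a / b) * math.sinh(w * s)
    return px, py


a, b, T, v = 0.02, 0.01, 5.0, 1.5
x0, y0 = 100.0, 80.0

# The dual variables at t = 0 should equal the sensitivity of the payoff
# to the initial force levels (checked here by central differences).
eps = 1e-3
dJ_dx0 = (payoff(x0 + eps, y0, a, b, T, v) - payoff(x0 - eps, y0, a, b, T, v)) / (2 * eps)
dJ_dy0 = (payoff(x0, y0 + eps, a, b, T, v) - payoff(x0, y0 - eps, a, b, T, v)) / (2 * eps)
px0, py0 = adjoint(0.0, a, b, T, v)
print(dJ_dx0, px0)
print(dJ_dy0, py0)
```

Under these assumptions the worth of one more X unit, px(t), is largest at the start of the battle and falls to 1 at t = T, so the value of a unit evolves with time even when there is no allocation decision to make.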

c. Other Topics Not Included in This Report.

It seems appropriate to note two other areas of work that for one
reason or another have not been included in this report: (1) other
tactical allocation formulations and (2) target coverage problems. We
have done initial work on the formulation of other tactical allocation
situations:

(a) fire support of several ground units, 

(b) weapon system only subject to attrition when engaging a target 
type. 



We also did some work on coverage problems. We obtained a new 
result for the hit probability against a circular target when the dis- 
tribution of impact points follows an offset circular bivariate normal 
distribution. Although this type of problem has been extensively studied 
(in a recent survey article Eckler [31] gives 60 references; see also 
Grubbs' [36] brief survey), we have discovered a new representation for 
the hit probability, and this yields several useful approximations. 

Consider a circular target with radius a located at the center 
of an x-y rectangular coordinate system. Assume that the distribu- 
tion of impact points follows an offset circular bivariate normal distri- 
bution. We let 
Also

for R > a

\[ P = \exp\{-(a^2 + R^2)/(2\sigma^2)\} \sum_{k=1}^{\infty} (a/R)^k \, I_k[aR/\sigma^2] . \]

The above formulas are readily proven through an intermediate result
of Gilliland [35]. We may also express the above in closed form through
the use of Lommel’s functions of two variables (see Watson [79] p. 537).

for R < a

[equation not recoverable from the source]

and

for R > a

\[ P = -\exp\{-(a^2 + R^2)/(2\sigma^2)\} \, [\, i \, U_1(i a^2/\sigma^2, \, i aR/\sigma^2) + U_2(i a^2/\sigma^2, \, i aR/\sigma^2) \,] \]

where i = \sqrt{-1} and U_n(w,z) is Lommel’s function of two variables
defined by

\[ U_n(w,z) = \sum_{m=0}^{\infty} (-1)^m (w/z)^{n+2m} J_{n+2m}(z) . \]

Unfortunately, there exist no tabulations for Lommel’s function of two 
imaginary arguments. Since several problems of physical significance 
also lead to this type of solution, the creation of such tables seems 
warranted. 
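
In the absence of such tables the hit probability can be evaluated directly on a computer. The sketch below is illustrative: it assumes the standard series for the offset circular normal case in terms of modified Bessel functions I_k, with sigma the common standard deviation and R the offset of the aim point from the target center, and it checks the series against direct numerical integration of the radial miss-distance density. The function names and parameter values are hypothetical, and the series used here is the textbook form rather than a transcription of the report's own representation.

```python
import math


def bessel_i(k, x, terms=40):
    """Modified Bessel function I_k(x) from its power series (moderate x only)."""
    return sum((x / 2.0) ** (2 * m + k) / (math.factorial(m) * math.factorial(m + k))
               for m in range(terms))


def hit_probability_series(a, R, sigma, kmax=40):
    """Hit probability on a circle of radius a, aim point offset by R."""
    x = a * R / sigma ** 2
    pref = math.exp(-(a * a + R * R) / (2.0 * sigma ** 2))
    if R > a:
        return pref * sum((a / R) ** k * bessel_i(k, x) for k in range(1, kmax))
    return 1.0 - pref * sum((R / a) ** k * bessel_i(k, x) for k in range(0, kmax))


def hit_probability_quadrature(a, R, sigma, n=2000):
    """Simpson's rule on the radial density of the miss distance (n even)."""
    def pdf(r):
        return (r / sigma ** 2) * math.exp(-(r * r + R * R) / (2.0 * sigma ** 2)) \
            * bessel_i(0, r * R / sigma ** 2)

    h = a / n
    total = pdf(0.0) + pdf(a)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3.0


for a_, R_ in [(1.0, 2.0), (2.0, 1.0), (1.0, 0.0)]:
    print(a_, R_, hit_probability_series(a_, R_, 1.0),
          hit_probability_quadrature(a_, R_, 1.0))
```

For a centered aim point (R = 0) the series reduces to 1 - exp{-a^2/(2 sigma^2)}, which gives a quick sanity check. The power series for I_k is adequate only for moderate arguments; a library Bessel routine would be preferable for large aR/sigma^2.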



IV. CONCLUSIONS AND FUTURE EXTENSIONS.

Here we summarize what we have done, state some generalizations, 
and suggest some possible future research. Further amplification of 
results and conclusions is to be found in the appendices. We have 
considered the optimization of dynamic systems using the theory of 
optimal control/differential games. Specifically, we have accomplished 
the following: 

(1) devised method for solving terminal control attrition games, 

(2) compared sequence of idealized scenarios to study dependence 
of optimal allocation policies on model form, 

(3) developed solution to Lanchester-type equations with variable 
coefficients under special circumstances, 

(4) developed a new dynamic kill potential, 

(5) generalized results in continuous review deterministic 
inventory theory (optimal inventory policies for linear 
production costs and effect of budget constraints). 

Based on our studies we conclude that 

(1) tactics of target selection are dependent on model form and 
may be sensitive to force strengths, target acquisition 
processes, attrition processes, and/or termination conditions 
of combat, 

(2) tactics for target selection depend upon "command efficiency," 

(3) for a continuous review deterministic inventory process, when 
production costs are linear, then the optimal inventory policy 
is essentially independent of the nature of holding costs 
except for sometimes operating at the minimum of the shortage/ 
holding cost curve. 

We suggest the following as possible future work:

(1) develop in a more mathematical fashion our theory of terminal
control attrition games (The examples we have solved suggest
several necessary extensions to the existing mathematical
theory.),

(2) study extensions of supporting weapon system game (We would
examine optimal tactics for various battle termination
conditions and attrition processes.),

(3) further study problem of best firing rate when there are 
ammunition constraints with either time-varying or range- 
varying attrition rates (This would extend models considered 
in Appendix C and would use our results developed in Appendix 
D.), 

(4) formulate allocation of forces before the inception of combat 
problem (It is of interest whether the optimal strategy is 
mixed for then the element of surprise becomes important in 
planning a successful attack.), 

(5) develop other models of tactical interest and study other
extensions in the literature (We would continue to stress
the study of the dependence of optimal tactics on model form.).

APPENDIX A. The Isbell-Marlow Fire Programming Problem.

In this appendix we develop a complete solution to the Isbell 
and Marlow fire programming problem [52]. This is the simplest example 
of more general tactical allocation problems which are terminated by 
the system being steered to a specified terminal state. Subsequent 
work [82] which considered the work of Isbell and Marlow has been 
heuristic (not using the usual (today’s) necessary conditions [12]) 
possibly because of the incompleteness of this prior work. We originally
solved the Isbell-Marlow fire programming problem in order
to gain insight into the supporting weapon system game of H. Weiss [82].

In studying simplified models of dynamic tactical allocation pro- 
blems it is important to understand the dependence of the structure of 
optimal policies on model form. We have discovered in our research
that the optimal allocation policies may depend on the scenario chosen 
to study the problem. 

In this appendix we first state the fire programming problem; then
we outline our new solution procedure and indicate its extension to
two-sided problems (differential games). Next we present the details of
the solution, after which we discuss the structure of the optimal allo- 
cation policies. In view of the close connection [12], [41] between 
optimal control and differential games (Isaacs), the terminology of 
these two fields is used somewhat interchangeably. We begin by review- 
ing previous work briefly. 

An underdeveloped area [28] of the Lanchester theory of combat 
is target selection for combat among heterogeneous forces. This type 
of problem has been studied by Isbell and Marlow, who considered both
a truncated stochastic (Lanchester) process by game theoretic means [51] 
and a terminal control (one-sided) differential game [52]. An attrition 
differential game is an idealized combat situation described by Lanchester- 
type equations over a period of time with choices of tactics available 
to both sides and subject to change with time. Terminal control attri- 
tion games only end when the course of combat has been steered to a 
prescribed state. 

In developing a theory of target selection it is important to 
understand the dependence of allocation rules on the type of model chosen. 
Tactical allocation problems may be studied in two types of scenarios:
(1) the prescribed duration battle and (2) the terminal control battle
(a particular case of which is the "fight to the finish"). All the
attrition examples in Isaacs’ book [50] are of the first type (his "War 
of Attrition and Attack" is the continuous version of the tactical air 
war game [14], [15], [34] studied at RAND). Only Isbell and Marlow [52] 
and Weiss [82] have studied the terminal control problem. Unfortunately,
Isbell and Marlow did not obtain a complete solution to their problem:
they could not determine when certain terminal states of combat were
reached. Weiss studied a problem which may be considered a
generalization (two-sided version) of theirs. His solution procedure [82]
was heuristic and did not involve today's usual necessary
conditions [12], possibly because the simpler problem which he referenced
in his paper had not been completely solved.
a. Statement of the Problem.

The situation considered by Isbell and Marlow [52] is the simplest
problem of fire distribution: combat between an X-force of two force
types (for example, riflemen and grenadiers) and a homogeneous Y-force
(for example, riflemen only). This situation is shown diagrammatically
below.




It is the objective of the Y-force commander to maximize his survivors
at the end of battle and minimize those of his opponent (considering
the utilities assigned survivors). This is accomplished through his
choice of the fraction of fire, $\phi$, directed at $X_1$. The battle
terminates when one side or the other has been annihilated.

Mathematically the problem may be stated as

\[
\max_{\phi(t)} \; \{\, r\,y(T) - p\,x_1(T) - q\,x_2(T) \,\} \quad \text{with } T \text{ unspecified}
\]

subject to:

\[
\frac{dx_1}{dt} = -\phi\, a_1 y, \qquad
\frac{dx_2}{dt} = -(1-\phi)\, a_2 y, \qquad
\frac{dy}{dt} = -b_1 x_1 - b_2 x_2,
\]
\[
x_1, x_2, y \ge 0 \quad \text{and} \quad 0 \le \phi \le 1,
\]

where
p, q and r are utilities assigned to surviving forces,
$x_1$, $x_2$ and y are average force strengths,
$a_1$, $a_2$, $b_1$ and $b_2$ are constant attrition rates,
$\phi$ is fraction of Y-fire directed at $X_1$,
and with terminal states defined by (1) $x_1(T) = x_2(T) = 0$ and
(2) $y(T) = 0$.

The terminal surface of the "realistic" (one-sided) game is seen
to consist of five parts:

$C_1$: $x_1(T) = 0$, $x_2(T) > 0$, $y(T) = 0$,

$C_2$: $x_1(T) = 0$ before $x_2(T) = 0$, $y(T) > 0$,

$C_3$: $x_1(T) = 0$ after $x_2(T) = 0$, $y(T) > 0$,

$C_4$: $x_1(T) > 0$, $x_2(T) = 0$, $y(T) = 0$,

$C_5$: $x_1(T) > 0$, $x_2(T) > 0$, $y(T) = 0$.
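The state equations and termination rule stated above can be explored numerically. The following is only a minimal sketch, assuming a forward-Euler discretization and a caller-supplied allocation rule `phi`; the function name, the step size, and the rule that no fire is directed at an already annihilated force type are illustrative assumptions, not part of the original formulation.

```python
from typing import Callable, Tuple

# phi(t, x1, x2, y) -> fraction of Y-fire directed at X1 (hypothetical signature)
Policy = Callable[[float, float, float, float], float]


def simulate_fire_programming(
    x1: float, x2: float, y: float,
    a1: float, a2: float, b1: float, b2: float,
    phi: Policy, dt: float = 1e-4, t_max: float = 100.0,
) -> Tuple[float, float, float, float]:
    """Forward-Euler integration until one side is annihilated.

    Returns (T, x1(T), x2(T), y(T)).
    """
    t = 0.0
    while t < t_max and y > 0.0 and (x1 > 0.0 or x2 > 0.0):
        f = min(max(phi(t, x1, x2, y), 0.0), 1.0)  # enforce 0 <= phi <= 1
        # Assumption: no fire is directed at a force type already annihilated.
        if x1 <= 0.0:
            f = 0.0
        elif x2 <= 0.0:
            f = 1.0
        dx1 = -f * a1 * y
        dx2 = -(1.0 - f) * a2 * y
        dy = -(b1 * x1 + b2 * x2)
        # Force strengths are clamped at zero (x1, x2, y >= 0).
        x1 = max(x1 + dt * dx1, 0.0)
        x2 = max(x2 + dt * dx2, 0.0)
        y = max(y + dt * dy, 0.0)
        t += dt
    return t, x1, x2, y


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    result = simulate_fire_programming(
        3.0, 0.0, 5.0, 1.0, 1.0, 1.0, 1.0, phi=lambda t, x1, x2, y: 1.0
    )
    print(result)
```

With only one X force type present the run reduces to an ordinary square-law duel, which gives a simple check: the quantity $a_1 y^2 - b_1 x_1^2$ should be (approximately) conserved.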

b. Solution Procedure and Extensions.

An extremal path (a path on which the necessary conditions for
optimality are satisfied almost everywhere) may be obtained by routine
application of Pontryagin's maximum principle [68] (the original authors
used equivalent conditions independently developed by Isaacs [48]).
However, in a terminal control problem we would like to know the domain of
controllability [32] for each terminal state so that tactics are
determined in terms of the initial conditions of combat (and also possibly
time). We define the domain of controllability for a given terminal
state to be that subset of the initial state space from which extremals
lead to the terminal state.

The following procedure has been used to solve the above problem: 

(a) extremal control is determined by maximizing the Hamiltonian;
since the state variables (force strengths) are non-negative, the
control depends, in many cases, only on relationships between the
dual variables (marginal return from destroying target),

(b) from each separate terminal state, the time history of the dual
variables is obtained by a backward integration of the adjoint
system of differential equations; for a square law attrition
process, the adjoint equations are independent of the state
variables,

(c) for each terminal state the domain of controllability is
determined by forward integration of the state equations using the
time history of extremal control developed in (b); changes in
control with time (existence of a transition surface) may have to
be considered in this step.

It is noted that Isbell and Marlow [52] stopped at step (b) above. 
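Step (a) can be illustrated with a short sketch. It assumes the Hamiltonian in the form used in the development of the solution later in this appendix, $H = -\{p_1 \phi a_1 y + p_2 (1-\phi) a_2 y + p_3 (b_1 x_1 + b_2 x_2)\}$, which is linear in $\phi$, so the maximizing control depends only on the sign of $a_2 p_2 - a_1 p_1$ when $y > 0$. The function names and the numerical values are hypothetical; the brute-force search is included only as a cross-check.

```python
def hamiltonian(phi, x1, x2, y, p1, p2, p3, a1, a2, b1, b2):
    """Hamiltonian of the fire programming problem (p1, p2, p3 are dual variables)."""
    return -(p1 * phi * a1 * y + p2 * (1.0 - phi) * a2 * y + p3 * (b1 * x1 + b2 * x2))


def extremal_phi(p1, p2, a1, a2):
    """Maximizer of H over 0 <= phi <= 1 when y > 0.

    H is linear in phi with slope y * (a2*p2 - a1*p1), so the control is
    bang-bang. At exactly zero slope any phi maximizes H; 0.0 is returned
    there purely by convention (an assumption of this sketch).
    """
    v = a2 * p2 - a1 * p1
    return 1.0 if v > 0.0 else 0.0


def brute_force_phi(x1, x2, y, p1, p2, p3, a1, a2, b1, b2, n=101):
    """Grid search over phi, used only to cross-check extremal_phi."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda f: hamiltonian(f, x1, x2, y, p1, p2, p3, a1, a2, b1, b2))


if __name__ == "__main__":
    # Hypothetical dual-variable values, for illustration only.
    print(extremal_phi(p1=-3.0, p2=-1.0, a1=1.0, a2=2.0))
    print(brute_force_phi(1.0, 1.0, 2.0, -3.0, -1.0, 0.5, 1.0, 2.0, 1.0, 1.0))
```

Because the state variables enter only through the non-negative factor $y$, the choice of target depends only on the dual variables, as remarked in step (a).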

The complete solution to this problem is shown in Table AI. Details
are presented below. A significant point to note is that the extremals
are unique (the domains of controllability do not overlap), so that the
extremal control turns out to be the optimal control. This solution
procedure may be easily extended to terminal control differential games
(such as [82], in which the usual necessary conditions [12] were not
applied). We do this in Appendix B. However, in two-sided problems
this author has noted that domains of controllability may overlap and
there may be multiple extremals from a given point in the initial
state space, so that additional considerations must be employed.

Table AI. Solution to Target Selection Problem: Fight to the Finish
(columns: Terminal State, Optimal Control, Conditions on Initial Values)

c. Some Comments.

We note that the solution to a "fight to the finish" may depend 
upon the initial strengths of the combatants. This should be contrasted 
with the optimal allocation which is independent of force strength in 
the prescribed duration battle. We contrast the solution properties 
for these two cases in greater detail in Appendix C. 

Examining this solution process provides valuable insight
into the corresponding differential (supporting weapon system) game:

(a) devising a solution process,

(b) understanding why no transition (switching) surface is present
in the original problem studied by Weiss,

(c) formulating a game which may possess a switching surface
(optimal strategies change with time).

It is noted that the supporting weapon system game may be viewed as an
extension of this fire programming problem. The following aspects of
these two problems are also noteworthy:

(a) both represent the simplest allocation problems of their type,

(b) both are terminal control problems (as opposed to the tactical
war games studied by RAND researchers [14], [15], [34]; it
is noted that the continuous version of these is Isaacs'
[50] "war of attrition and attack").

It is noteworthy that if the objective function were modified to
$r\,y(T) - p\,x_1(T)$, then the entire solution to the new problem is the
same as shown for case A in Table AI, except that the optimal control
for entry to $C_1$ is not unique. Any control which leads to this state
is optimal, since the payoff is always zero. Let us note that the
deletion of $q\,x_2(T)$ from the objective function has caused nonuniqueness
in the solution and absence of a transition surface under any
circumstances. We shall see that these observations are important for
understanding the solution of the original version of Weiss' supporting
weapon system game.

We note that the approach developed here for solving terminal
control attrition games differs from that used to solve pursuit
and evasion differential games. Some examples of the latter are worked
out in detail in a companion report [76]. In Table AII we summarize
some major points of practical difference.

d. Development of Solution.

The solution is actually derived for a "reduced" game (that
portion of battle during which Y is faced with a choice problem).
We illustrate here for extremals to $C_1$. It suffices to trace extremals
up to $t_1$, when $x_1(t_1) = 0$, since $\phi = 0$ from then until the end of
the game. The determination of the value of the reduced game, denoted by
$V(x_1, x_2, y)$, which is needed to determine the values of the adjoint
variables on the terminal surface, and the part of the solution originally
obtained by Isbell and Marlow will not be repeated here, although we
shall outline the general steps.

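The Hamiltonian is

\[
H(t, x, p, \phi) = -\{\, p_1 \phi a_1 y + p_2 (1-\phi) a_2 y + p_3 (b_1 x_1 + b_2 x_2) \,\}
\]

and the adjoint equations are

\[
\dot p_1 = b_1 p_3, \qquad
\dot p_2 = b_2 p_3, \qquad
\dot p_3 = \phi a_1 p_1 + (1-\phi) a_2 p_2,
\]

with

\[
p_1(t = t_1) = \text{unspecified},
\]
\[
p_2(t = t_1) = \frac{\partial V}{\partial x_2}
 = \frac{-q \sqrt{b_2}\; x_2}{\sqrt{b_2 x_2^2 - a_2 y^2}},
\]
\[
p_3(t = t_1) = \frac{\partial V}{\partial y}
 = \frac{q\, a_2\, y}{\sqrt{b_2}\, \sqrt{b_2 x_2^2 - a_2 y^2}},
\]

where the state variables are evaluated at $t = t_1$.

Table AII. Some Differences Between Terminal Control Attrition Games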



The extremal control is obtained from $\max_{\phi} H(t, x, p, \phi)$, and we
also have that

\[
\max_{\phi} H(t, x, p, \phi) = 0.
\]

Obtaining a solution to this problem is simplified by the following
considerations. Let $\tau = t_1 - t$ and define

\[
v(\tau) = a_2 p_2(\tau) - a_1 p_1(\tau),
\]

then we have

\[
\frac{dv}{d\tau} = (a_1 b_1 - a_2 b_2)\, p_3(\tau)
\]

with

\[
v(\tau = 0) = a_2 p_2(\tau = 0) - a_1 p_1(\tau = 0),
\]

and where (up until the first shift of tactics)

\[
p_3(\tau) = p_3(\tau = 0) \cosh\{\sqrt{\phi a_1 b_1 + (1-\phi) a_2 b_2}\; \tau\}
 - \frac{\phi a_1 p_1(\tau = 0) + (1-\phi) a_2 p_2(\tau = 0)}
        {\sqrt{\phi a_1 b_1 + (1-\phi) a_2 b_2}}
   \sinh\{\sqrt{\phi a_1 b_1 + (1-\phi) a_2 b_2}\; \tau\}.
\]

The extremal control is determined by

\[
\phi(t) = 0 \ \text{for } v(\tau) < 0, \qquad
\phi(t) = 1 \ \text{for } v(\tau) > 0.
\]

It is easy to show that it is impossible for $v(\tau) = 0$ over any finite
interval of time, and hence the possibility of any singular solution
[53] to this problem is excluded. By the symmetry of this problem it
suffices to assume that $a_1 b_1 > a_2 b_2$. In this case the domains of
controllability for $C_3$ and $C_4$ are void.
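The backward integration of the dual variables can be checked numerically. The sketch below assumes the adjoint system written in backward time $\tau$ as $dp_1/d\tau = -b_1 p_3$, $dp_2/d\tau = -b_2 p_3$, $dp_3/d\tau = -\{\phi a_1 p_1 + (1-\phi) a_2 p_2\}$ with $\phi$ held constant, and compares a Runge-Kutta integration against the cosh/sinh expression for $p_3(\tau)$. All function names and numerical values are hypothetical.

```python
import math


def adjoint_rhs(p, phi, a1, a2, b1, b2):
    """Right-hand side of the adjoint system in backward time tau."""
    p1, p2, p3 = p
    return (-b1 * p3, -b2 * p3, -(phi * a1 * p1 + (1.0 - phi) * a2 * p2))


def integrate_adjoint_backward(tau_end, phi, p0, a1, a2, b1, b2, n=2000):
    """Classical fourth-order Runge-Kutta integration from tau = 0 to tau_end."""
    h = tau_end / n
    p = tuple(p0)
    for _ in range(n):
        k1 = adjoint_rhs(p, phi, a1, a2, b1, b2)
        k2 = adjoint_rhs(tuple(pi + 0.5 * h * k for pi, k in zip(p, k1)), phi, a1, a2, b1, b2)
        k3 = adjoint_rhs(tuple(pi + 0.5 * h * k for pi, k in zip(p, k2)), phi, a1, a2, b1, b2)
        k4 = adjoint_rhs(tuple(pi + h * k for pi, k in zip(p, k3)), phi, a1, a2, b1, b2)
        p = tuple(
            pi + h * (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0
            for pi, c1, c2, c3, c4 in zip(p, k1, k2, k3, k4)
        )
    return p


def p3_closed_form(tau, phi, p0, a1, a2, b1, b2):
    """cosh/sinh expression for p3(tau), valid while phi stays constant."""
    p1_0, p2_0, p3_0 = p0
    k = math.sqrt(phi * a1 * b1 + (1.0 - phi) * a2 * b2)
    w0 = phi * a1 * p1_0 + (1.0 - phi) * a2 * p2_0
    return p3_0 * math.cosh(k * tau) - (w0 / k) * math.sinh(k * tau)


def switching_function(p, a1, a2):
    """v = a2*p2 - a1*p1; its sign selects the extremal control."""
    return a2 * p[1] - a1 * p[0]


if __name__ == "__main__":
    p0 = (-1.0, -0.5, 0.3)  # hypothetical values of the dual variables at tau = 0
    p = integrate_adjoint_backward(1.3, 1.0, p0, 2.0, 1.0, 1.0, 1.0)
    print(p[2], p3_closed_form(1.3, 1.0, p0, 2.0, 1.0, 1.0, 1.0))
    print(switching_function(p, 2.0, 1.0))
```

The integration uses no state variables at all, which is the property of the square law noted in step (b) of the solution procedure.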

The major contribution of our present research is to show how to 
determine the domains of controllability. There are two cases to 
consider. 

Case (a) $a_2 q \le a_1 p$

This is the easier case, and some of these results apply to the
other case. The only time when the Y forces win is when terminal
state $C_2$ is entered: $x_1(t_1) = x_2(T) = 0$ and $y(T) > 0$, where T is the time
of the end of the battle and $t_1 < T$ is such that $x_1(t_1) = 0$.
We determine the domain of controllability by combining the
time history of the extremal control, the non-negativity requirements
on the state variables, and the generalized square law

\[
Z^2(t_1) - Z^2(t_2) = \{\phi a_1 b_1 + (1-\phi) a_2 b_2\}\,\bigl(y^2(t_1) - y^2(t_2)\bigr),
\]

where $\phi$ = const. in $t_1 \le t \le t_2$ and $Z(t) = b_1 x_1(t) + b_2 x_2(t)$.

For the case at hand we have

\[
a_1 (y(t = t_1))^2 = a_1 (y^0)^2 - \{\, b_1 (x_1^0)^2 + 2 b_2 x_1^0 x_2^0 \,\}
\]

and

\[
-b_2 (x_2^0)^2 = a_2 \{\, (y(T))^2 - (y(t = t_1))^2 \,\}.
\]

The desired condition is found by elimination of $y(t = t_1)$ between
the above equations and requiring that $y(T) > 0$.

It remains to distinguish between entry to $C_1$ and $C_5$. On entry
to $C_5$, we have that $x_1(T) > 0$, $x_2(T) = x_2^0$ and $y(T) = 0$. The
application of our "modified square law" yields

\[
b_1 (x_1(T))^2 + 2 b_2 x_2^0 x_1(T) = b_1 (x_1^0)^2 + 2 b_2 x_1^0 x_2^0 - a_1 (y^0)^2,
\]

whence our result by requiring that $x_1(T) > 0$.
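The elimination of $y(t = t_1)$ described above can be sketched in a few lines. The code assumes that Y fires only at $X_1$ until $x_1 = 0$ and then only at $X_2$, so that two applications of the square law give $(y(T))^2 = (y^0)^2 - \{b_1 (x_1^0)^2 + 2 b_2 x_1^0 x_2^0\}/a_1 - b_2 (x_2^0)^2 / a_2$. The function names, the Euler cross-check, and the numbers are hypothetical and serve only as a consistency test.

```python
import math


def y_survivors_squared(x1_0, x2_0, y_0, a1, a2, b1, b2):
    """(y(T))^2 when Y engages X1 until x1 = 0 and then X2 until x2 = 0.

    A positive value means that Y wins with y(T) > 0; a non-positive
    value means that this terminal state is not reached.
    """
    y_t1_sq = y_0**2 - (b1 * x1_0**2 + 2.0 * b2 * x1_0 * x2_0) / a1
    return y_t1_sq - b2 * x2_0**2 / a2


def simulate_x1_then_x2(x1, x2, y, a1, a2, b1, b2, dt=1e-4):
    """Forward-Euler cross-check of the same policy (a sketch, not exact)."""
    while y > 0.0 and (x1 > 0.0 or x2 > 0.0):
        phi = 1.0 if x1 > 0.0 else 0.0
        x1_new = max(x1 - dt * phi * a1 * y, 0.0)
        x2_new = max(x2 - dt * (1.0 - phi) * a2 * y, 0.0)
        y = max(y - dt * (b1 * x1 + b2 * x2), 0.0)
        x1, x2 = x1_new, x2_new
    return x1, x2, y


if __name__ == "__main__":
    # Hypothetical strengths and attrition rates with a1*b1 > a2*b2.
    args = (1.0, 1.0, 3.0, 2.0, 1.0, 1.0, 1.0)
    print(math.sqrt(y_survivors_squared(*args)))
    print(simulate_x1_then_x2(*args))
```

Requiring the returned quantity to be positive gives the condition on the initial values under which Y wins.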

Case (b) $a_2 q > a_1 p$

The work of Isbell and Marlow has been extended by showing how
to determine the domains of controllability when a switching surface
is present in the solution. The conditions for entry to $C_2$ are as
before. We must develop conditions to distinguish between entry to
$C_1$ and $C_5$ and two subcases for entry to $C_5$.

$C_1$ is entered in those cases when the $X_1$ forces are destroyed
before a switch in tactics is required. It is recalled that the latter
condition, determined by backward integration of the adjoint differential
equations from the terminal surface and the maximum principle, is
independent of the initial conditions of the state variables. Entry to
$C_1$ is determined by the relationship between the proportion of total
battle time (forward) to destroy $X_1$ and the time (backward) of the
potential switch. The figure below shows the relationship between
these times, where $\tau = T - t$, $\tau_1$ is the time (backward) of the switch,
$t = t_1$ is such that $x_1(t_1) = 0$, and T is the time (forward) of the
end of the battle. As shown, $C_1$ would be entered.




The condition for entry to $C_1$ is that $T - t_1 > \tau_1$, where $\tau_1$ is
the backward time of the switch, i.e., the optimum length of $\tau$-time for
engaging $X_2$ is less than the remaining time for $X_2$ to destroy Y after
Y has annihilated $X_1$ (battle starts with engagement of $X_1$). From the
"modified square law,"

\[
(y(t = t_1))^2 = (y^0)^2 - \frac{b_1}{a_1} (x_1^0)^2 - \frac{2 b_2}{a_1}\, x_1^0 x_2^0 .
\]

After annihilation of $X_1$ there is another battle of length $T - t_1$
remaining. Hence, for this portion where $t_1 \le t \le T$,

\[
y(t) = y(t = t_1) \cosh \sqrt{a_2 b_2}\,(t - t_1)
     - x_2^0 \sqrt{\frac{b_2}{a_2}}\, \sinh \sqrt{a_2 b_2}\,(t - t_1).
\]

Since $y(t = T) = 0$, we have

\[
\tanh \sqrt{a_2 b_2}\,(T - t_1) = \sqrt{\frac{a_2}{b_2}}\; \frac{y(t = t_1)}{x_2^0}.
\]

From integration of the adjoint equations and the maximum principle,
the $\tau$-time of the switch, $\tau_1$, is given by

\[
\cosh \sqrt{a_2 b_2}\; \tau_1 = \frac{a_1 (q b_1 - p b_2)}{q\,(a_1 b_1 - a_2 b_2)} .
\]

The desired condition is determined by requiring that $T - t_1 > \tau_1$ (as
defined above), use of the identities

\[
\cosh^{-1} x = \ln\bigl[x + \sqrt{x^2 - 1}\,\bigr], \qquad
\tanh^{-1} y = \tfrac{1}{2} \ln\Bigl(\frac{1 + y}{1 - y}\Bigr),
\]

and considerable algebraic manipulation.

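It finally remains to distinguish between the two cases of entry
to $C_5$. If $\phi(t) = 0$ for $0 \le t \le T$, then

\[
y(t) = y^0 \cosh \sqrt{a_2 b_2}\; t
     - \frac{(b_1 x_1^0 + b_2 x_2^0)}{\sqrt{a_2 b_2}}\, \sinh \sqrt{a_2 b_2}\; t.
\]

The boundary between the two cases is when $y(T) = 0$ for $T = \tau_1$, the
backward time of the switch, and hence,

\[
(y^0)^2 \bigl[\cosh \sqrt{a_2 b_2}\; \tau_1\bigr]^2
 = \frac{(b_1 x_1^0 + b_2 x_2^0)^2}{a_2 b_2}
   \Bigl\{ \bigl[\cosh \sqrt{a_2 b_2}\; \tau_1\bigr]^2 - 1 \Bigr\},
\]

where $\cosh \sqrt{a_2 b_2}\; \tau_1$ is given as above. Noting that $\phi = 0$ for the
entire battle when $T < \tau_1$ and re-arranging, we obtain the result
shown in Table AI.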

e. Structure of Optimal Allocation Policies.

For square law attrition it may be shown that the allocation of 
fraction of fire is always 0 or 1 (see previous section for remark) , 
and fire is concentrated on one target type. This is not surprising, 
since our model assumes complete and instantaneous information [13] and 
that fire may be immediately shifted to a new target once the old one 
has been destroyed [22], [81]. 

With reference to Table AI, the condition that $a_1 b_1 > a_2 b_2$ may
be interpreted to mean that there is more long range return for Y to
engage $X_1$, i.e., more Y's will survive if this is done. Hence,
when Y wins, he always engages $X_1$'s while they are available. The
condition $a_1 p < a_2 q$ means that at the end of battle there is greater
payoff per unit time per Y soldier to engage $X_2$, not considering $X_1$'s
greater attrition effect against Y (short term gain at end of battle).

By the maximum principle and the well-known interpretation of the
dual variables [12], Y always allocates his fire entirely to the
target type yielding the greatest marginal return. However, marginal
return evolves differently in winning or losing causes. When Y loses,
he may switch from firing at $X_1$ entirely to firing at $X_2$ entirely
before the $X_1$ force has been annihilated. This happens when Y assigns
utility to survivors of force type $X_2$ in excess of their kill rate
against Y as compared to force type $X_1$, and $X_1$ is abundant enough
not to be destroyed before the battle ends.
In this way, we see that tactics may depend on force levels. We
also see that Y's target priorities only switch with time in a losing
case. This occurs because a boundary condition at t = T on one
of the dual variables depends upon values of the state variables
through a transversality condition. It may be shown that the structure of
optimal allocation policies is different for the prescribed duration
battle.

In Appendix F we show how such considerations as those discussed 
above may be developed into the concept of a dynamic kill potential. 
However, we do so from the standpoint of the adjoint system for a system 
of differential equations. (This approach may be used as an alternative 
to that of Pontryagin for the development of his maximum principle.) 
APPENDIX B. H. K. Weiss' Supporting Weapon System Game

In this appendix we develop the solution to the supporting weapon
system game of H. K. Weiss [82] by applying the theory of differential
games. Previously, this problem had been solved under restrictive
assumptions by heuristic means. The solution procedure developed here is general
and applies to any terminal control attrition game. A new solution concept
is motivated by this development, and solution behavior not previously noted
for differential games is encountered.

Our research on this and similar dynamic tactical allocation problems
indicates that there are several significant differences in theory and
results between attrition and pursuit-evasion differential games. We have
briefly considered such differences in Appendix A. However, much excellent
research has been done on generalized control theory applicable to pursuit
and evasion problems, and we envision the application of such results to
tactical allocation problems as being fruitful future research. For example,
the concepts of stochastic control could be applied to a situation in which
combatants select targets without knowing precisely what the results of
firings will be.

The model considered here is an idealization of a real combat situation.
Its value lies in the insight it provides into the relations between system
parameters. It should not be expected to produce a numerical answer to a
specific problem but rather to indicate general principles to serve as
hypotheses for subsequent computer simulation studies or field experimentation.
In this manner, the model considered here may be used to study the following
facets of supporting weapon systems: performance characteristics,
allocation rules, and the impact of intelligence and command and control
factors on the preceding.

There are two types of scenarios in which we may study idealizations
of tactical allocation problems: (1) the prescribed duration battle and
(2) the terminal control battle, i.e., the game only ends when the course
of battle has been steered to a prescribed state. All the attrition
problems studied by Isaacs [50] are of the first type. It is noted that his
War of Attrition and Attack is the continuous version of other such studies
[14], [15], [34]. Only Isbell and Marlow [52] and Weiss have studied the
terminal control problem. The former did not obtain a complete solution
to their problem; we have done so in Appendix A, and this motivated the
present development. Only by studying several types of models can we begin
to understand the dependence of allocation rules on model form.

In this appendix we consider what forms of such dynamic models are
available before we review Weiss' problem formulation. We then critique
his previous approach before outlining our new solution procedure and
presenting details of solution development. We then discuss the structure of
optimal allocation policies. We also discuss extensions of the model and
a pitfall of model formulation before we contrast some facets of prescribed
duration battles to fights to the finish. We finally mention a few
implications of the models we have considered. In view of the intimate
relationship [12], [41] between optimal control theory and differential games
(Isaacs), we use their terminology somewhat interchangeably.
a. Forms of Model Available.

It seems appropriate to discuss the factors affecting the optimal 
allocation policies. Different assumptions regarding these factors lead 
to models with different optimal allocation policies. The model for a 
tactical allocation problem involves three factors: 

(1) the payoff, 

(2) the description of combat, 

(3) the planning horizon. 

We will consider a terminal payoff with a linear objective function.
The tactical allocation problems studied at RAND [14], [15], [34], [50]
all involved an integral payoff. Further comment on the effect of
including only one of the two force types in the payoff, as Weiss [82] did, seems
appropriate. What effect does this have on the optimal allocation? From
the present work, it seems reasonable to conjecture that for two-on-two
combat the optimal strategies for a side will be constant over time (except
for the obvious change when a force under attack becomes exhausted) if the
payoff only includes one force type. It is further conjectured that this
is the reason (only the "men" of each side appearing in the payoff) that
the optimal strategies in the reduced supporting weapon system game of
H. K. Weiss are constant over time and that optimal strategies may vary
over time when all force types are included in the payoff function. It
will be seen that optimal strategies only change over time for the loser,
who engages the force type that does him the most damage in the early
stages of the battle and the force included in the payoff on which he has
the most effect in the latter stages. We conjecture that the winner's
optimal strategy is always constant over time for "fights to the finish."
For our description of the combat attrition process we may consider
a generalized Lanchester linear law or a square law (although other
mathematical descriptions have been noted as applicable to specific situations).
For a square law attrition process the attrition rate is proportional to
enemy strength, while for a linear law it is proportional to the product
of both enemy and friendly force strengths. With rare exception ([75] or
Isaacs' "war of attrition and attack: second version" [50]), previously
published work has considered only the square law model. In Appendix C
we show that a square-law attrition process leads to a "bang-bang" optimal
control while the linear law leads to a singular solution (see p. 481 of
[6]). The mathematical development is much more complex in the second
case, but we have studied singular problems on numerous occasions (pursuit
and evasion [76], inventory theory, the continuous version of Bellman's
stochastic gold-mining problem).
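The two attrition laws can be contrasted with a small numerical sketch for a one-on-one engagement. It assumes the homogeneous forms $dx/dt = -a y$, $dy/dt = -b x$ (square law) and $dx/dt = -a x y$, $dy/dt = -b x y$ (linear law); the coefficient names, step size, and starting strengths are hypothetical. Under these assumed forms the square law conserves $b x^2 - a y^2$ and the linear law conserves $b x - a y$.

```python
def square_law_step(x, y, a, b, dt):
    """One Euler step of dx/dt = -a*y, dy/dt = -b*x (loss rate ~ enemy strength)."""
    return x - dt * a * y, y - dt * b * x


def linear_law_step(x, y, a, b, dt):
    """One Euler step of dx/dt = -a*x*y, dy/dt = -b*x*y (loss rate ~ product of strengths)."""
    return x - dt * a * x * y, y - dt * b * x * y


def run(step, x, y, a, b, dt=1e-3, n=1000):
    """Apply the chosen attrition law for n Euler steps."""
    for _ in range(n):
        x, y = step(x, y, a, b, dt)
    return x, y


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    xs, ys = run(square_law_step, 10.0, 8.0, 0.5, 0.4)
    print("square law invariant:", 0.4 * xs**2 - 0.5 * ys**2)
    xl, yl = run(linear_law_step, 10.0, 6.0, 0.05, 0.04)
    print("linear law invariant:", 0.04 * xl - 0.05 * yl)
```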

It seems appropriate to briefly discuss the physical assumptions which 
underlie these idealizations of combat attrition. The square law arises 
under conditions which include that "each unit is informed about the loca- 
tion of the remaining opposing units so that when a target is destroyed, 
fire may be immediately shifted to a new target" as noted by Weiss [81]. 

It is noted that differential game theory itself assumes complete informa- 
tion (except that a player does not know the instantaneous strategy of the 
opposing player). The linear law arises when either target acquisition is 
subject to diminishing returns [22] or fire is not redirected towards sur- 
viving targets after attrition occurs [39], [70], [81]. 

In the present work a model is formulated for the simplest case of
partial information: "area fire" is delivered by the supporting weapon
system against the ground troops, who use a constant area defense, while the
perfect information assumption is retained on the state of the supporting
weapon system. Again quoting Weiss [81], we assume that the supporting
weapon system units are informed about the general areas in which the
opposing infantry units are located but are not informed about the
consequences of their own fire. Thus, we see that we may account for some
changes in the information set by modifying the description of combat.
Unfortunately, the mathematics of the resulting problem is much more complex
than previously encountered, and a complete solution has not yet been
obtained for this case. For this model of incomplete information, one
introduces the concept of inferred information (players know more than they
can observe directly) based on each player's knowledge of the time history
of his control variables and considers the resulting equations in this
light.

Another factor having a bearing on the optimal allocation policies 
is the length of the planning horizon (length of the battle). The follow- 
ing three alternative models are available: 

(1) battle of prescribed time duration, 

(2) battle of unspecified time duration, 

(3) battle until the extermination of one side. 

Our research has subsequently shown that case (2) is not a properly
posed problem in the classical sense [27]. Models applying to the first
instance have been extensively studied by RAND researchers [14], [15],
[34], [50]. The present work (as an extension of the work of Isbell and
Marlow and Weiss) will address the third case, "fights to the finish."
The mathematical details of solution and the structure of optimal policies
are significantly different for these two cases. Games of
prescribed duration are mathematically simpler than "fights to the finish,"
since the terminal surface consists of one "piece" and many different
portions do not have to be considered. Once the adjoint equations have
been integrated backward from the terminal surface, the history of the
extremal strategies (and hence optimal strategies) becomes uniquely
determined unless a state variable goes to zero and a subgame is entered. On
the other hand, for a terminal control game, extremals to all the distinct
portions of the terminal surface must be considered. Entry to a portion
of the terminal surface must be verified by both considerations "in the
large" and forward integration of the state equations (after determination
of extremal strategies). Many times the potential existence of a
transition (switching) surface turns out to be illusory, and the complete
solution may turn out to be radically different from what was initially
anticipated.
b. Problem as Formulated by Weiss.

The problem studied by Weiss [82] may be stated as follows: how should the
fire support systems of two heterogeneous forces (each consisting of
ground forces and its fire support system) optimally engage the opposing
combatant? The objective is for each side to minimize its losses in a
conflict which terminates when the opposing side is annihilated. The
ground forces (infantry) are assumed to have a negligible effect in
producing casualties on each other.

Using Weiss' original notation the problem was finally reduced to
the payoff:

\[
\max_{\phi} \min_{\psi} \; [\, y_1(T) - y_2(T) \,], \tag{B1}
\]

where T is the unspecified terminal time of the battle and $\phi$ and $\psi$
are decision variables representing the fraction of 'air' of ODD and EVEN
which engages the opposing 'infantry'. The average strengths of remaining
forces are given by the state equations:

\[
\dot y_1 = -\psi\, y_4, \qquad
\dot y_2 = -\phi\, y_3, \qquad
\dot y_3 = -(1-\psi)\, y_4, \qquad
\dot y_4 = -(1-\phi)\, y_3, \tag{B2}
\]

with boundary conditions:

\[
y_1(t=0) = y_1^0, \quad y_1(t=T) = 0, \qquad
y_2(t=0) = y_2^0, \qquad
y_3(t=0) = y_3^0, \qquad
y_4(t=0) = y_4^0, \tag{B3}
\]

where $\dot y_i = dy_i/dt$
and
$y_1$, $y_2$ = average strength of 'infantry' of ODD and EVEN at time t,
$y_3$, $y_4$ = average strength of 'air' of ODD and EVEN at time t.

It is noted that the $y_i$ are transformed variables which include attrition
rates. We will also denote terminal values as $y_i(t=T) = y_{is}$, in
consonance with Weiss' notation. It is finally noted that the terminal condition
on $y_1$ has been specified as a prelude to the development in a future
section.
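The state equations (B2) are easy to step forward for fixed strategies, which gives a feel for the payoff (B1) before any game-theoretic analysis. The sketch below assumes constant $\phi$ and $\psi$, a forward-Euler step, and termination when either side's infantry reaches zero; the function name and the numbers are hypothetical, and no claim is made that these constant strategies are optimal.

```python
def simulate_weiss_game(y0, phi, psi, dt=1e-4, t_max=50.0):
    """Forward-Euler integration of (B2) with constant strategies.

    y0 = (y1, y2, y3, y4): infantry of ODD and EVEN, then air of ODD and EVEN.
    phi, psi: fractions of ODD's and EVEN's air engaging the opposing infantry.
    Returns (T, (y1, y2, y3, y4), payoff) with payoff = y1(T) - y2(T).
    """
    y1, y2, y3, y4 = y0
    t = 0.0
    while t < t_max and y1 > 0.0 and y2 > 0.0:
        d1 = -psi * y4
        d2 = -phi * y3
        d3 = -(1.0 - psi) * y4
        d4 = -(1.0 - phi) * y3
        # Strengths are clamped at zero.
        y1 = max(y1 + dt * d1, 0.0)
        y2 = max(y2 + dt * d2, 0.0)
        y3 = max(y3 + dt * d3, 0.0)
        y4 = max(y4 + dt * d4, 0.0)
        t += dt
    return t, (y1, y2, y3, y4), y1 - y2


if __name__ == "__main__":
    # Hypothetical initial strengths; both sides direct all air at infantry.
    print(simulate_weiss_game((2.0, 3.0, 1.0, 1.0), phi=1.0, psi=1.0))
```

With $\phi = \psi = 1$ the air strengths stay constant and each infantry force declines linearly, so the outcome can be checked by hand.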
c. Critique of Previous Solution Procedure.

We should bear in mind that Weiss’s excellent paper [82] (it con- 
tains much more than the mathematical solution of a differential game) 
was written over ten years ago. Writing many years before results 
were known beyond a small number of researchers, he did not employ the 
usual (today’s) necessary conditions [12]. The original solution 
technique in this pioneering effort used unsupported assumptions which, 
in general, are not true, although the correct answer was obtained to 
the particular problem posed. Weiss assumed that optimal strategies 
would be (a) either 0 or 1 and (b) constant over time and then 
determined the saddle point of the payoff function. It will be seen 
that rather laborious computations are required to establish the solu- 
tion form that Weiss assumed. 

Weiss’s pioneering effort is especially remarkable when one con- 
siders that Isaacs’s book [50] had not yet been written and only Isaacs’s 
early RAND memos (see in particular [48], [49]) were available. Also, 
Isbell and Marlow had failed to obtain a complete solution to a simpler 
(one-sided) terminal control problem. We note that Weiss's problem
(and also the Isbell-Marlow fire programming problem) does not appear to be
known to the control theorists [5], [13], [24], [71].

Weiss’s paper also contains an extension of the attrition model 
imbedded in an economic model of conflicting systems. It also contains 
a penetrating analysis of weapon system performance characteristics 
and concludes with a discussion of insight gained into the optimum 
design of real world weapon systems. 
d. Solution Procedure.

In this section we outline the solution procedure, introduce the
concept of the "reduced game," illustrate the determination of extremal
strategies, and discuss the concept of a "blockable" terminal state.

Outline of Solution Procedure

In a terminal control problem, we must determine the optimal
strategies for each player in terms of the initial conditions of combat (and
also possibly time). The solution procedure consists of two phases:
(a) determine all extremal strategies and (b) determine optimal
strategies from among the extremal strategies. By an extremal, we mean a path
on which the necessary conditions [12] for optimality are satisfied almost
everywhere.

We must consider each terminal state separately. For each terminal 
state, there will be one or more extremal paths leading to that state. 
Extremal paths may be determined by routine application of the well- 
known necessary conditions. For each extremal path to a terminal state 
there is a domain of controllability, which we define to be that subset 
of the initial state space from which a family of extremals leads to 
the terminal state. The solution procedure may be summarized as: 

(1) identify "attainable" terminal states, 

(2) determine "domain of controllability" in initial condition 
space corresponding to each extremal leading to every 
"attainable" terminal state, 

(3) partition the space of initial conditions into exhaustive 
and mutually exclusive sets, each of which is covered by 
the "domain(s) of controllability" of one, two, etc., of 
the extremals to terminal states, 

(4) the solution is uniquely determined at this point for regions
covered by part of only one domain of controllability,

(5) delete from further consideration those portions of the
domain of controllability of any terminal state which is
"blockable" from those initial points; again the solution
is uniquely determined (extremal is optimal) for those
regions reverting to step (4),

(6) if there is still more than one extremal to a given terminal
state for a set of points in the initial condition space,
compute the value of the game for each extremal; the final
solution is determined by comparing these values.

The concept of a "blockable" terminal state is discussed below.

Concept of the "Reduced Game"

The battle is over when either $y_1$ or $y_2$ becomes zero. It is
convenient to introduce the concept of the "reduced game." Let us
henceforth refer to the original problem as the "realistic game." In
attrition games (especially "fights to the finish") the allocation
problem may disappear before the terminal surface is reached. Let us
refer to that part of the game for which the full allocation problem
exists as the "reduced game," and we now consider the terminal surface
of the reduced game. The value of the reduced game must be back-calculated
from the value of the realistic game. To illustrate, the terminal
surface for the above problem is defined by three terminal states: (a)
$y_1(T) = 0$, (b) $y_2(T) = 0$, and (c) $y_1(T) = 0$ and $y_2(T) = 0$. The
terminal surface of the reduced game is seen to consist of five portions,
and these are shown in Table BI.
It will be seen that the extremal strategies to each of these 
require a different development. The payoff on B is $-y_2(T)$, 
since ODD has lost all his infantry at the terminal surface of the 
realistic game. It may be that a portion of the terminal surface is 
not attainable from any point in the initial state space, and this is 
what Isaacs refers to as the non-useable portion of the terminal surface 
[50]. This concept is, however, not particularly useful in the solution 
of an attrition game. The concept of the domain of controllability for 
a terminal state is more useful. 

Portions of Terminal Surface 

    A    EVEN wins    $y_1(T) = 0$ 
    B    EVEN wins    $y_3(T) = 0$ 
    C    ODD wins     $y_2(T) = 0$ 
    D    ODD wins     $y_4(T) = 0$ 
    E    DRAW 

Extremals leading to A 

    (1) $a_1$:   $\phi = 1$, $\psi = 1$    for $0 \le t \le T$ 

    (2) $a_2$:   $\phi = 1$, $\psi = 0$    for $0 \le t \le T - \tau_1$ 
                 $\phi = 1$, $\psi = 1$    for $T - \tau_1 \le t \le T$ 

    (3) $a_3$:   $\phi = 0$, $\psi = 0$    for $0 \le t \le T - \tau_2$ 
                 $\phi = 1$, $\psi = 0$    for $T - \tau_2 \le t \le T - \tau_1$ 
                 $\phi = 1$, $\psi = 1$    for $T - \tau_1 \le t \le T$ 

Extremals leading to B 

    (1) $b_1$:   $\phi = 1$, $\psi = 0$    for $0 \le t \le T$ 

    (2) $b_2$:   $\phi = 0$, $\psi = 0$    for $0 \le t \le T - \tau_1$ 
                 $\phi = 1$, $\psi = 0$    for $T - \tau_1 \le t \le T$ 

Note: Extremals to C and D are symmetric to above. 

Table BI. Extremals and Terminal Surface Defined. 

Determination of Extremal Strategies 

Table BI shows the five terminal states to the ("reduced") supporting 
weapon system game. Extremal paths are determined for a "reduced 
game," which is that part of the game for which a full allocation 
problem exists. For example, after $y_4 = 0$, ODD uses $\phi = 1$ until 
EVEN's infantry is annihilated, and we need only consider the battle up 
until that time. Moreover, to determine boundary conditions on the dual 
variables in the "reduced game," we must consider the payoff of the 
entire game. We discuss this point further in the next section. 

We will now outline the obtaining of extremal strategies when, 
for example, terminal state A is entered (EVEN wins by destroying ODD's 
infantry), i.e., $y_1(T) = 0$ and T is unspecified. In this case the 
objective function becomes: 

$$\max_{\phi} \min_{\psi} \{-y_2(T)\}.$$ 

We introduce "costate" or dual variables, denoted by $p_i$, one for each 
state equation and representing rate of change of the game value to the 
players (here terminal payoff to the game) with respect to the various 
state variables. We now form the following Hamiltonian: 

$$H(t,y,p;\phi,\psi) = \psi y_4 (p_3 - p_1) + \phi y_3 (p_4 - p_2) - y_4 p_3 - y_3 p_4.$$ 

From this Hamiltonian we form the following "adjoint" equations: 

$$\frac{dp_1}{dt} = -\frac{\partial H}{\partial y_1} = 0 \;\Rightarrow\; p_1(t) = \text{const.},$$ 

$$\frac{dp_2}{dt} = -\frac{\partial H}{\partial y_2} = 0 \;\Rightarrow\; p_2(t) = \text{const.},$$ 

$$\frac{dp_3}{dt} = -\frac{\partial H}{\partial y_3} = \phi p_2 + (1 - \phi) p_4,$$ 

$$\frac{dp_4}{dt} = -\frac{\partial H}{\partial y_4} = \psi p_1 + (1 - \psi) p_3, \qquad \text{(B4)}$$ 

with boundary conditions 

$$p_1(t = T) = \text{unspecified}, \quad p_2(t = T) = -1, \quad p_3(t = T) = 0, \quad p_4(t = T) = 0. \qquad \text{(B5)}$$ 



Extremal strategies (as a function of time) are determined from 
$\max_{\phi(t)} \min_{\psi(t)} H(t,y,p;\phi,\psi)$, which is equal to zero, since the 
terminal time is left unspecified. Thus we have 

$$\max_{\phi} \{\phi y_3 (p_4 - p_2)\} + \min_{\psi} \{\psi y_4 (p_3 - p_1)\} - y_4 p_3 - y_3 p_4 = 0, \qquad \text{(B6)}$$ 

where it is recalled that we must have $0 \le \phi, \psi \le 1$. 
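
Because the controls enter (B6) linearly, the maximizing and minimizing values sit at the ends of the interval [0, 1] and are read off from the signs of the two switching functions. The following is a minimal sketch under the assumption that $y_3 > 0$ and $y_4 > 0$; the function name `extremal_controls` and the handling of an exact tie are choices made here, not part of the report.

```python
from typing import Tuple


def extremal_controls(p1: float, p2: float, p3: float, p4: float) -> Tuple[float, float]:
    """Bang-bang controls implied by (B6) when y3 > 0 and y4 > 0.

    ODD maximizes phi * y3 * (p4 - p2); EVEN minimizes psi * y4 * (p3 - p1).
    An exactly zero switching function is resolved arbitrarily here
    (phi = 0, psi = 1); that tie rule is an assumption of this sketch.
    """
    phi = 1.0 if (p4 - p2) > 0.0 else 0.0
    psi = 0.0 if (p3 - p1) > 0.0 else 1.0
    return phi, psi
```

For example, with the boundary values (B5) and any positive $p_1$, the call `extremal_controls(p1, -1.0, 0.0, 0.0)` gives both controls equal to 1 at the end of the battle.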

Extremal strategies are determined by a backward integration of 
the adjoint equations (B4) with boundary conditions (B5) and considering 
(B6), since the boundary conditions of the dual variables are at the 
terminal surface. It is noted that for square-law attrition the 
adjoint equations are independent of the state variables (except for 
a boundary condition by a transversality relation) and so are the 
extremal strategies. The domain of controllability for an extremal so 
determined is obtained by a forward integration of the state equations. 
The non-negativity of the state variables plays a central role in these 
determinations [74]. Details for the case at hand are presented in the 
next section. 
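
The forward integration can be sketched numerically. The state equations used below are not quoted from this section; they are the ones implied by the Hamiltonian through $\dot{y}_i = \partial H / \partial p_i$, and should be read as an assumption. The names `rhs`, `rk4_step`, and `run_until_terminal` are hypothetical, and controls are held constant over the run.

```python
from typing import Optional, Tuple

State = Tuple[float, float, float, float]


def rhs(y: State, phi: float, psi: float) -> State:
    """State equations implied by the Hamiltonian (an assumption of this sketch)."""
    y1, y2, y3, y4 = y
    return (-psi * y4, -phi * y3, -(1.0 - psi) * y4, -(1.0 - phi) * y3)


def rk4_step(y: State, phi: float, psi: float, dt: float) -> State:
    """One classical Runge-Kutta step with the controls held fixed."""
    def shifted(k: State, h: float) -> State:
        return tuple(yi + h * ki for yi, ki in zip(y, k))

    k1 = rhs(y, phi, psi)
    k2 = rhs(shifted(k1, dt / 2), phi, psi)
    k3 = rhs(shifted(k2, dt / 2), phi, psi)
    k4 = rhs(shifted(k3, dt), phi, psi)
    return tuple(
        yi + dt / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def run_until_terminal(
    y0: State, phi: float, psi: float, dt: float = 1e-4, t_max: float = 50.0
) -> Tuple[float, State, Optional[int]]:
    """Integrate forward until some force level reaches zero.

    Returns (time, state, index of the exhausted force), with the index
    counted from 1, or None if nothing is exhausted before t_max.
    """
    t, y = 0.0, tuple(float(v) for v in y0)
    while t < t_max:
        y = rk4_step(y, phi, psi, dt)
        t += dt
        for i, level in enumerate(y):
            if level <= 0.0:
                return t, y, i + 1
    return t, y, None
```

Running a candidate extremal from a grid of initial strengths and recording which force is exhausted first gives a numerical picture of its domain of controllability; the non-negativity test inside the loop is where that constraint enters.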

Concept of a "Blockable" Terminal State 

It may be shown that for many regions of the initial state space 
of this problem, there is more than one family of extremals leading to 
terminal states. The reason for existence of multiple extremals is that 
the min-max principle is merely necessary and of a local nature (see 
Athans and Falb [6] for a discussion of the corresponding situation in 
control theory). The attainable portions of the terminal surface are 
not "close together" when multiple extremals are present. 

A solution aspect unique to terminal control attrition games is 
that in cases where there are extremals from the same initial point to 
different terminal states corresponding to the same player both winning 
and losing, entry to a terminal state may be "blocked" by the "losing" 
player through use of an admissible strategy other than his extremal 
strategy. In other words, there is a path determined by the necessary 
conditions leading from each point in a region of the initial state 
space to a terminal state, but the "losing" player may use a strategy 
other than his extremal strategy to actually win. This behavior 
highlights the local ("in the small") nature of the necessary conditions 
and the fact that the conditions are, indeed, only necessary, i.e., they 
assume that the losing player cannot prevent the terminal state from 
being reached. 

e. Development of Solution. 

In this section we determine the optimal strategies from among 
the extremal strategies as discussed in the previous section. We also 
present the details of the derivation of extremals and domains of 
controllability. 

Determination of Optimal Strategies 

We now apply steps (3) to (6) of our solution procedure. Since 
the approach developed here may be used to show that Weiss’s original 
solution technique did indeed yield the correct solution to this parti- 
cular problem, the interested reader is directed to the original paper 
for the complete solution. We illustrate our procedure for the case 
when y° = y^/^. 

Application of step (3) yields the regions shown in Figure B1 with 
further details being provided by Tables BI and BII. It is noted that 
in region III, EVEN can "block" ODD's steering the course of battle to 
$y_4(T) = 0$ by countering ODD's strategy of $\phi = 0$ with $\psi = 0$ instead 
of using his extremal strategy $\psi = 1$. Since EVEN has more air, he 
would win this strategic war. Hence, ODD would not consider trying to 
steer the course of combat to state D, since entry to this state is 
"blockable" for $y_4^0 > y_3^0$. Table BII summarizes such considerations. 
Discussion is still required on step (6) above for Regions I, II, III, 

IV, and V as shown in Figure B1. We now show that the "domain of 
controllability" corresponding to $a_1$ contains that of $a_2$ and the payoff to 
player 2 for extremal $a_1$ is always greater than that for $a_2$ in 
these regions. Consequently, by applying the principle of optimality 
[9], extremal $a_2$ may also be dropped from further consideration. 

Figure B1. Regions for Determining Optimal Strategies. 

For extremal $a_1$ we have that 

$$T = y_1^0 / y_4^0, \qquad y_{3s} = y_3^0.$$ 

The domain of controllability is given by: 

$$S_{a_1} = \left\{ y^0 \;\middle|\; y_4^0 > y_3^0,\; y_3^0 \ge y_1^0,\; y_2^0 > y_1^0 \left( \frac{y_3^0}{y_4^0} \right) \right\}.$$ 

Similarly, for extremal $a_2$: 

$$\tau_1 = y_1^0 / y_4^0, \qquad T = y_3^0 / y_4^0, \qquad y_{3s} = y_1^0,$$ 

$$S_{a_2} = \left\{ y^0 \;\middle|\; y_4^0 > y_1^0,\; y_3^0 \ge y_1^0,\; y_2^0 > \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0},\; y_4^0 \ge \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0} \right\}.$$ 

When $y_4^0 > y_3^0$ (otherwise A is "blockable" for extremal $a_2$), we have 
that $S_{a_2} \subset S_{a_1}$. (PROOF: for $y^0 \in S_{a_2}$ with $y_4^0 > y_3^0$, the condition 
$y_3^0 \ge y_1^0$ is satisfied; also 

$$\frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0} \ge y_1^0 \left( \frac{y_3^0}{y_4^0} \right),$$ 

so that 

$$y_2^0 > \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0} \ge y_1^0 \left( \frac{y_3^0}{y_4^0} \right);$$ 

hence $y^0 \in S_{a_2}$ with $y_4^0 > y_3^0$ implies $y^0 \in S_{a_1}$.) 

We now consider the payoffs. Denote the payoff to player 2 for extremal 
$a_i$ by $P_{a_i}$. Then 

$$P_{a_1} = y_2^0 - y_1^0 \left( \frac{y_3^0}{y_4^0} \right).$$ 

Similarly, it may be shown that 

$$P_{a_2} = y_2^0 - \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0}.$$ 

It is easy to show that $P_{a_1} \ge P_{a_2}$ for all $y^0 \in S_{a_2} \cap \{ y^0 \mid y_4^0 > y_3^0 \}$. 
Since EVEN determines the choice of these extremals, $a_1$ will be 
chosen since it yields the largest payoff for EVEN. 

It remains to compare the payoffs to EVEN for a^ and in 

Region IV and V. It may be shown that 



^bi = ^2 - 



Hence for <1/2, we have that P < P, . Thus a^ is optimal 

yj >1 1 

in Region IV, but b^ is optimal in Region V. 

Derivation of Extremals and Domains of Controllability 

We provide details for terminal states A and B. 

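Terminal State A: $y_1(T) = 0$ 

At t = T, it is clear from (B6) that $\phi(t = T) = 1$. Combining 
this result with (B5), we have at t = T: 

$$y_{3s} + \min_{\psi} \{ -\psi \, y_{4s} \, p_1 \} = 0,$$ 

where $y_{is}$ denotes $y_i(t = T)$. Thus $p_1 = y_{3s}/y_{4s}$ and $\psi(t = T) = 1$. Then 

$$\phi(t) = \begin{cases} 0 & \text{for } p_4(t) < -1, \\ 1 & \text{for } p_4(t) > -1, \end{cases}$$ 

and 

$$\psi(t) = \begin{cases} 0 & \text{for } p_3(t) > y_{3s}/y_{4s}, \\ 1 & \text{for } p_3(t) < y_{3s}/y_{4s}. \end{cases}$$ 

There are now two separate cases which we must consider. We let 
$\tau = T - t$. The adjoint equations of interest become 

$$\frac{dp_3}{d\tau} = \phi - (1 - \phi) p_4, \qquad p_3(\tau = 0) = 0, \quad \phi(\tau = 0) = 1,$$ 

$$\frac{dp_4}{d\tau} = -\psi \frac{y_{3s}}{y_{4s}} - (1 - \psi) p_3, \qquad p_4(\tau = 0) = 0, \quad \psi(\tau = 0) = 1.$$ 

Case (a) $y_{3s} < y_{4s}$ 

$\psi$ changes first in $\tau$-time, call this $\tau_1$. For $\tau_1 \le \tau < \tau_2$, then 

$$p_4(\tau) = -\tfrac{1}{2} \left\{ \tau^2 + \left( \frac{y_{3s}}{y_{4s}} \right)^2 \right\},$$ 

and for $\tau_2 \le \tau \le T$, 

$$p_3(\tau) = \sqrt{2 - \left( \frac{y_{3s}}{y_{4s}} \right)^2} \, \cosh(\tau - \tau_2) + \sinh(\tau - \tau_2),$$ 

and 

$$p_4(\tau) = -\cosh(\tau - \tau_2) - \sqrt{2 - \left( \frac{y_{3s}}{y_{4s}} \right)^2} \, \sinh(\tau - \tau_2).$$ 

Hence 

(a) for $0 \le \tau < \tau_1 = y_{3s}/y_{4s}$, $\phi(\tau) = 1$ and $\psi(\tau) = 1$, 

(b) for $\tau_1 \le \tau < \tau_2 = \sqrt{2 - (y_{3s}/y_{4s})^2}$, $\phi(\tau) = 1$ and $\psi(\tau) = 0$, 

(c) for $\tau_2 \le \tau \le T$, $\phi(\tau) = 0$ and $\psi(\tau) = 0$. 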
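
The switching times for Case (a) can be checked by integrating the two adjoint equations backward numerically. The sketch below assumes the equations $dp_3/d\tau = \phi - (1 - \phi) p_4$ and $dp_4/d\tau = -\psi \rho - (1 - \psi) p_3$ with $\rho = y_{3s}/y_{4s} < 1$, both starting from zero at $\tau = 0$; the function name `switch_times_state_a` and the explicit Euler scheme are choices made here for brevity.

```python
import math
from typing import List, Tuple


def switch_times_state_a(
    rho: float, tau_max: float = 3.0, dtau: float = 1e-4
) -> List[Tuple[str, float]]:
    """Backward (tau = T - t) integration for terminal state A, Case (a).

    rho stands for y3s / y4s and is assumed to be below 1. Returns the
    control that switches and the approximate tau at which it does so.
    """
    p3, p4 = 0.0, 0.0
    phi, psi = 1, 1  # both controls equal 1 at the end of the battle
    switches: List[Tuple[str, float]] = []
    steps = int(round(tau_max / dtau))
    for k in range(1, steps + 1):
        dp3 = phi - (1 - phi) * p4
        dp4 = -psi * rho - (1 - psi) * p3
        p3 += dtau * dp3
        p4 += dtau * dp4
        new_phi = 1 if p4 > -1.0 else 0  # ODD's switching rule
        new_psi = 1 if p3 < rho else 0   # EVEN's switching rule
        if new_psi != psi:
            switches.append(("psi", k * dtau))
        if new_phi != phi:
            switches.append(("phi", k * dtau))
        phi, psi = new_phi, new_psi
    return switches
```

With `rho = 0.5` the first switch (in $\psi$) should appear near $\tau = 0.5$ and the second (in $\phi$) near $\sqrt{2 - 0.25}$, up to the step size of the integration.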



We now integrate the state equations forward using the above to 
determine the domains of controllability. When we employ $\phi = 1$ and 
$\psi = 1$ for $0 \le t \le T$, we have that $y_{3s} = y_3^0$ and $T = y_1^0 / y_4^0$. Using the 
facts that $\tau_1 \ge T$ and $y_2(T) > 0$, we find that $y_4^0 > y_3^0$, $y_3^0 \ge y_1^0$, and 

$$y_2^0 > y_1^0 \left( \frac{y_3^0}{y_4^0} \right).$$ 

When we employ $\phi = 1$ and $\psi = 0$ for $0 \le t \le T - \tau_1$ and 
$\phi = 1$ and $\psi = 1$ for $T - \tau_1 \le t \le T$, it may be shown that $y_{3s} = y_1^0$ 
and $T = y_3^0 / y_4^0$. Using the facts that $\tau_1 \le T$, $\tau_2 \ge T$, and $y_2(T) > 0$, 
we find that $y_4^0 > y_1^0$, $y_3^0 \ge y_1^0$, 

$$y_2^0 > \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0}, \qquad y_4^0 \ge \frac{(y_3^0)^2 + (y_1^0)^2}{2 y_4^0}.$$ 

Case (b) $y_{4s} < y_{3s}$ 

As above, we may show that 



(a) 

(b) 

(c) 



for 



for 



for 



0 ^ T < = 



^ T < 



^2 



T2 ^ T ^ T, 





4>(t) 

Y 



= 1 and = 



(|)(t) = 1 and 



1 , 

^(t) = 0, 



(|>(t) = 0 and = 0. 



Proceeding as before, when we employ cj) = 1 and = 1 for 



0 ^ t ^ T, we have that y, = y° and T = Using the facts that 

^ ^4 rV' 

^ T and > 0 , we find that y° < 73.73 > ^ 

1 o o ^ 

and > y^ 



lylJ 



ly 



When we employ c() = 1 and = 0 for 0 ^ t ^ T - 



4 s 



and 



4 s 



^ t ^ T, it may be shown that T = 



c() = 1 and = 1 for T - 

^ 45 “^ o ^ 

Using the fact ■" ” ^3’ shown that y° > Y^jY^ ^ 

73,72 > 73. and (74)^ > 



Terminal State B: $y_3(T) = 0$ 

For this case the values of the adjoint variables on the terminal 
surface are: 

$$p_1(t = T) = 0, \quad p_2(t = T) = -1, \quad p_3(t = T) = \text{unspecified}, \quad p_4(t = T) = 0,$$ 

with $y_3(t = T) = 0$. 

It is noted that $p_1(t = T) = 0$ even though $y_1(t = T) = y_1^0$. The 
reason for this is that we must consider the payoff of the entire game 
to determine boundary conditions for the "reduced game," as noted above. 
Thus, we must set $p_1(t = T) = 0$, since ODD must lose all his infantry 
after his air has been lost and thus has no value for infantry without 
air. 

Subsequent details are similar to those for terminal state A. It 
may be shown that 

(a) for $0 \le \tau < \tau_1$, $\phi(\tau) = 1$ and $\psi(\tau) = 0$, 

(b) for $\tau_1 \le \tau \le T$, $\phi(\tau) = 0$ and $\psi(\tau) = 0$. 

When we employ $\phi = 1$ and $\psi = 0$ for $0 \le t \le T$, we have that 
$T = y_3^0 / y_4^0$. Using the facts that $\tau_1 > T$ and $y_2(T) > 0$, we find that 
$y_3^0 < \sqrt{2} \, y_4^0$ and $2 y_2^0 y_4^0 > (y_3^0)^2$. The case with the transition surface 
need not be worked out, since B is "blockable" due to $y_3^0 \ge y_4^0$. 
It is noted that terminal states C and D are symmetric with A and B. 

f. Structure of Optimal Allocation Policies. 

Three characteristics of the solution to the supporting weapon 
system game are that the optimal strategies are: 

(1) either 0 or 1, 

(2) constant over time (no transition surfaces), 

(3) dependent on initial strengths. 

The first characteristic is a consequence of square-law attrition, 
which makes the existence of a singular control [53] impossible and 
hence strategies are extreme points in the control variable space. 
Singular control is, however, possible when there is linear law 
attrition for the target types over which fire is distributed. 

It is conjectured that the absence of transition surfaces in the 
solution is the consequence of two factors: (a) the problem is a 
terminal control one and (b) only one target type is in the payoff. 
In a similar one-sided problem [52], [74], such a switch in tactics 
only occurs in a losing cause when both target types are weighted in a 
terminal payoff. If we were to consider a prescribed duration battle, 
then it may be shown that transition surfaces may occur for both sides 
(compare with Isaacs' [50] War of Attrition and Attack). Inclusion of 
only infantry in the payoff has the effect, in this case, of causing 
air to always be directed at infantry during the last stages of battle. 
It is conjectured that there can exist transition surfaces in the 
solution when all target types are weighted in the payoff. When this is 
done, however, it may be shown that Weiss's change of variables is 
inappropriate (payoff must also be transformed), and the original 
formulation of the state equations with kill rate coefficients must be used. 

Finally, it may also be shown that for the prescribed duration 
battle target selection depends only on the attrition rates of the 
various force types and relative weights assigned to surviving force 
types. This should be contrasted with the terminal control case where, 
as we have just seen, tactics depend on force levels. Thus, we see that 
tactics depend on the circumstances under which the conflict ends, and 
Weiss has written a fundamental paper [83] on this topic. 

g. Extensions of Model. 

It seems appropriate to discuss two extensions of Weiss' original 
model; one extends the type of payoff and the other modifies the 
information set available to the players. This second extension is believed 
to be more descriptive of the deployment of a supporting weapon system 
against ground forces. Complete solutions have not yet been developed 
for either of these. Analytic details of parts of the solution to the 
first are presented in a section below. 

The first extension is the following: 

payoff to ODD: $p x_1(T) + q x_3(T) - r x_2(T) - s x_4(T)$ with T unspecified 

subject to: 

$$\dot{x}_1 = -\psi a_1 x_4,$$ 
$$\dot{x}_2 = -\phi b_1 x_3,$$ 
$$\dot{x}_3 = -(1 - \psi) a_2 x_4,$$ 
$$\dot{x}_4 = -(1 - \phi) b_2 x_3,$$ 

with appropriate initial conditions and terminal states as defined before. 
The reason for the re-introduction of the kill rate coefficients is 
significant and is discussed in the next section. 

It is conjectured that the optimal strategies for this problem 
may vary with time. The form of the payoff function has modified the 
marginal advantage of target engagement. This has been caused by the 
new terms in the payoff. Although the detailed solution has not yet 
been worked out, extremals do have time-varying strategies. By our 
previous experience with the supporting weapon system game, we see, 
however, that this is not conclusive proof that the optimal strategies 
vary with time. One additional factor that we have at our disposal to 
induce the presence of a switching surface is the value attached to 
surviving forces. From our earlier experience with the fire programming 
problem, we would expect the shift in target engagement to apply for the 
loser (unlike the previous game) of the battle. He would, for example, 
allocate his air to the force type against which he had the greatest 
net effect in the early stages of battle and engage the force type for 
which the payoff (including kill rate) is greatest during the last stage 
of his losing effort. 

The Hamiltonian for this first reformulation is 

$$H(t,x,p;\phi,\psi) = \psi x_4 (a_2 p_3 - a_1 p_1) + \phi x_3 (b_2 p_4 - b_1 p_2) - a_2 p_3 x_4 - b_2 p_4 x_3.$$ 

If we were to consider a battle of prescribed duration T, then we would 
have 

$$p_1(t = T) = p, \quad p_2(t = T) = -r, \quad p_3(t = T) = q, \quad p_4(t = T) = -s.$$ 

Optimal strategies (there is only one extremal) are determined from 

$$\min_{\psi} [\psi x_4 (a_2 p_3 - a_1 p)] + \max_{\phi} [\phi x_3 (b_2 p_4 + b_1 r)] - a_2 p_3 x_4 - b_2 p_4 x_3.$$ 

Hence 

$$\phi = \{ \operatorname{sgn}[b_2 p_4 + b_1 r] + 1 \} / 2,$$ 

$$\psi = \{ \operatorname{sgn}[a_1 p - a_2 p_3] + 1 \} / 2,$$ 

where 

$$\operatorname{sgn} x = \begin{cases} +1 & \text{if } x > 0, \\ -1 & \text{if } x < 0. \end{cases}$$ 

It may be shown that $\phi(t)$ can only change from 0 to 1 if it does, 
indeed, change during the course of battle and similarly for $\psi(t)$. 
Thus an artillery system would never switch from fire support to 
counterbattery fire in a battle described by this model. 
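
The sign-function form of the controls translates directly into code. In this sketch the names `sgn`, `phi_control`, and `psi_control` are hypothetical, and the value returned by `sgn` at exactly zero is an assumption, since the text leaves that case undefined.

```python
def sgn(x: float) -> float:
    """Sign function as used in the text; the value at x == 0 is an assumption."""
    return 1.0 if x > 0.0 else -1.0


def phi_control(p4: float, b1: float, b2: float, r: float) -> float:
    """ODD's allocation: 1 when b2*p4 + b1*r > 0, otherwise 0."""
    return (sgn(b2 * p4 + b1 * r) + 1.0) / 2.0


def psi_control(p3: float, a1: float, a2: float, p: float) -> float:
    """EVEN's allocation: 1 when a1*p - a2*p3 > 0, otherwise 0."""
    return (sgn(a1 * p - a2 * p3) + 1.0) / 2.0
```

Evaluated at the terminal values $p_3 = q$ and $p_4 = -s$, these reduce to comparisons of $b_1 r$ with $b_2 s$ and of $a_1 p$ with $a_2 q$.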
The second extension would replace the state equations by: 

$$\dot{x}_1 = -\psi a_1 x_1 x_4,$$ 
$$\dot{x}_2 = -\phi b_1 x_2 x_3,$$ 
$$\dot{x}_3 = -(1 - \psi) a_2 x_4,$$ 
$$\dot{x}_4 = -(1 - \phi) b_2 x_3.$$ 

For this model the Hamiltonian is 

$$H(t,x,p;\phi,\psi) = \psi x_4 (a_2 p_3 - a_1 x_1 p_1) + \phi x_3 (b_2 p_4 - b_1 x_2 p_2) - a_2 p_3 x_4 - b_2 p_4 x_3,$$ 

and the adjoint equations are: 

$$\dot{p}_1 = \psi a_1 x_4 p_1,$$ 
$$\dot{p}_2 = \phi b_1 x_3 p_2,$$ 
$$\dot{p}_3 = \phi b_1 x_2 p_2 + (1 - \phi) b_2 p_4,$$ 
$$\dot{p}_4 = \psi a_1 x_1 p_1 + (1 - \psi) a_2 p_3.$$ 

Since the adjoint equations now depend on the state variables, the 
resulting two-point boundary value problem does not possess a solution 
readily obtainable by elementary methods. 

The above is believed to be a more realistic model of the deploy- 
ment of a supporting weapon system against ground forces, since individual 
soldiers are not engaged as point targets in such combat situations. 

Weiss [82] has also shown that such a model applies to cases of partial 
information in the following sense: each supporting unit is informed 
about the general areas in which opposing infantry are located but is 
not informed about the consequences of its own fire. This version still 
maintains the complete information assumption for the supporting weapon 
systems. It seems more realistic that intelligence efforts would be 
more intense on a supporting weapon system of large kill potential and 
that intelligence for ground forces would be primarily concerned with 
location of troop units (aggregates of troops in specific areas) rather 
than individual soldiers. 

We have also considered other extensions and have done further 
analytic work on solutions than is presented here, but we do not present 
this at the present. 

h. A Pitfall of Model Formulation. 

Weiss [82] transformed his state equations of combat by 
introducing new variables which "absorbed" the kill rate coefficients. A 
pitfall of this procedure will now be discussed. It is easy to show 
that if the state variables are transformed, the payoff must also be 
appropriately transformed when a tradeoff exists between target types 
(all target types are present in payoff) . This point was not important 
for the original Weiss formulation, since only one target per side 
appeared in the payoff. Failure to note this point may lead to failure 
to identify all significant solution properties for optimal allocation. 
For example, in the fire programming problem for forces of equal value 
(payoff: ~ x^(T) - X 2 (T)) if the state equations were to be 

transformed to: 

= -(1 - 

^3 = -^1 - ^^ 2 ’ 

while the original payoffs were retained, then it may be shown that 
there is no transition surface in the solution under any circumstances. 
It is conjectured that in the original version of the supporting weapon 
system game this aspect of model formulation would have also prevented 
the existence of time-varying optimal strategies under any circumstances. 

i. Battles of Prescribed Duration and Fights to the Finish. 

In this section we discuss some differences between the prescribed 
duration battle and the terminal control battle (a special case of which 
is the "fight to the finish"). We begin by contrasting various aspects 
qualitatively and then present some solution details for one of the 
model extensions mentioned earlier. We do so for both the prescribed 
duration battle and the fight to the finish. 

General Discussion 

Of prime interest to the operations research worker who seeks 
an understanding of complex phenomena, is the extent to which his choice 
of model influences this perspective. We shall see that what determines 
the end of a battle is very important to the combatants for their selec- 
tion of optimal tactics. We shall contrast the battle for a prescribed 
duration to the battle to a specified terminal state (in particular, 
the "fight to the finish"). 

In all cases, target selection depends on the marginal return 
for engagement. For the supporting weapon system game, marginal return 
is the rate of change of the value of the game (in terms of forces 
remaining) per unit of force allocated. It is measured by the product 
of the rate of change of this value per unit of force type (dual variable) 
and of the kill rate of this force type by the supporting weapon system. 
Air or infantry is engaged depending on the difference of such 
quantities. Similar remarks apply to the fire programming problem. This 
richness of interpretation of the dual variables is not present in the 
analysis of multimove discrete games [14], [15], [34]. A very significant 
point is that the type of model chosen (form of payoff function 
and planning horizon) may lead to a different evolution of marginal 
return. This is clear if one only considers the values of the dual 
variables on the terminal surface. In the terminal control case, such 
a value of one of the dual variables depends on initial strengths and 
the history of the battle through the transversality condition 
$H(t = T, y, p; \phi, \psi) = 0$, whereas for the battle of prescribed duration 
such values are independent of initial strengths. 

In fights to the finish (extension one of section g) , a 
commander must estimate the most vulnerable part of the enemy force 
(both kill rate and force level) and then concentrate the entire fire 
of the supporting weapon system on this. The winner continues with his 
chosen strategy until the desired end is achieved. The loser may shift 
fire to minimize his losses depending upon the weights he attaches to 
remaining units of the winner’s force types and his effectiveness 
against each. For the battle of prescribed duration, on the other hand, 
target selection is independent of initial strengths or tide of the 
battle. If the battle lasts long enough, the optimal tactic may be to 
shift fire regardless of whether one is winning or losing. 

The fight to the finish is thus strongly dependent upon the 
conditions under which a battle is ended, "the terminal states of 
combat." It appears that there is more research to be done in this 
important area, especially in view of the strong dependence of tactics 
on it as pointed out in this paper. The excellent paper of Weiss [83] 
on Richardson's data should be noted. The current development may be 
readily modified to termination at specified non-zero force levels. 
There are no mathematical complications from this change. 

Thus we conclude that a realistic model for optimal allocation 
must also consider the conditions under which the battle terminates. 

We could allow for replacements in such models. In such cases it might 
be appropriate to consider total losses as defining an additional 
terminal state. It may be necessary to consider different terminal 
states for each combatant (not symmetric). For example, we could 
construct a dynamic allocation model of guerrilla warfare in which we might 
consider the terminal state for the insurgents as reduction to a speci- 
fied level (possibly zero) , while for the counter-insurgents (both sides 
being allowed replacements) the end of the battle might be determined 
by the length of the conflict (people get tired of war) and/or total 
losses. 

Of interest to the military tactician is whether target selection 
rules evolve dynamically with the course of battle. Mathematically, 
this may be stated as whether there is a transition surface in the solu- 
tion. For the terminal control problems studied here, such a shift has 
been conjectured to be present only in a losing cause. For battles of 
fixed duration, the solution behavior is significantly different with 
the possibility of transition surfaces being present for both sides. 
Development of Solution to Prescribed Duration Battle 

We consider the following problem (which has been formulated 
from ODD's standpoint) 

$$\max_{\phi} \min_{\psi} \{ p x_1(T) + q x_3(T) - r x_2(T) - s x_4(T) \} \quad \text{with T specified},$$ 

subject to: 

$$\dot{x}_1 = -\psi a_1 x_4,$$ 
$$\dot{x}_2 = -\phi b_1 x_3,$$ 
$$\dot{x}_3 = -(1 - \psi) a_2 x_4,$$ 
$$\dot{x}_4 = -(1 - \phi) b_2 x_3, \qquad \text{(B7)}$$ 

with initial conditions 

$$x_1(t = 0) = x_1^0, \quad x_2(t = 0) = x_2^0, \quad x_3(t = 0) = x_3^0, \quad x_4(t = 0) = x_4^0.$$ 

In the subsequent development we assume that all initial strengths are 
such that a state variable is never reduced to zero so that a "subgame" 
is entered. 

The Hamiltonian, $H(t,x,p;\phi,\psi)$, is given by 

$$H(t,x,p;\phi,\psi) = \phi x_3 (b_2 p_4 - b_1 p_2) + \psi x_4 (a_2 p_3 - a_1 p_1) - a_2 p_3 x_4 - b_2 p_4 x_3.$$ 

The adjoint equations are thus given by 

$$\dot{p}_1 = 0 \;\Rightarrow\; p_1(t) = \text{const} = p,$$ 
$$\dot{p}_2 = 0 \;\Rightarrow\; p_2(t) = \text{const} = -r,$$ 
$$\dot{p}_3 = -\frac{\partial H}{\partial x_3} = \phi b_1 p_2 + (1 - \phi) b_2 p_4,$$ 
$$\dot{p}_4 = -\frac{\partial H}{\partial x_4} = \psi a_1 p_1 + (1 - \psi) a_2 p_3, \qquad \text{(B8)}$$ 

with terminal conditions 

$$p_1(t = T) = p, \quad p_2(t = T) = -r, \quad p_3(t = T) = q, \quad p_4(t = T) = -s,$$ 

so that the Hamiltonian becomes 

$$H(t,x,p;\phi,\psi) = \phi x_3 (b_2 p_4 + b_1 r) + \psi x_4 (a_2 p_3 - a_1 p) - a_2 p_3 x_4 - b_2 p_4 x_3, \qquad \text{(B9)}$$ 

with the extremal strategies being determined by $\max_{\phi} \min_{\psi} H(t,x,p;\phi,\psi)$. 
Hence the optimal strategies (there is only one extremal) are given by 

$$\phi(t) = \begin{cases} 0 & \text{for } b_2 p_4 < -b_1 r, \\ 1 & \text{for } b_2 p_4 > -b_1 r, \end{cases}$$ 

and 

$$\psi(t) = \begin{cases} 0 & \text{for } a_2 p_3 > a_1 p, \\ 1 & \text{for } a_2 p_3 < a_1 p. \end{cases} \qquad \text{(B10)}$$ 

Let us note that at t = T, (B10) becomes 

$$\phi(t = T) = \begin{cases} 0 & \text{for } b_1 r < b_2 s, \\ 1 & \text{for } b_1 r > b_2 s, \end{cases}$$ 

and 

$$\psi(t = T) = \begin{cases} 0 & \text{for } a_2 q > a_1 p, \\ 1 & \text{for } a_2 q < a_1 p, \end{cases} \qquad \text{(B11)}$$ 

which conditions the four cases we study below. 

We let $\tau = T - t$ in order that we may integrate the adjoint 
equations backwards from the end of the battle where the boundary 
condition is given for the dual variables. Then, we have for any $\tau$-time 
interval over which strategies are constant 

$$\frac{dp_3}{d\tau} = \phi b_1 r - (1 - \phi) b_2 p_4, \qquad p_3(\tau = 0) = q,$$ 

$$\frac{dp_4}{d\tau} = -\psi a_1 p - (1 - \psi) a_2 p_3, \qquad p_4(\tau = 0) = -s, \qquad \text{(B12)}$$ 

where $\phi(\tau)$ and $\psi(\tau)$ are given by (B10). From (B11) it is easily 
seen that there are four cases to consider. 
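
The backward integration of (B12) under the switching rule (B10) can be sketched numerically for any set of coefficients. The function name `backward_switches`, the explicit Euler scheme, and the step count are choices made here for illustration; the returned times are measured in $\tau = T - t$.

```python
from typing import List, Tuple


def backward_switches(
    a1: float, a2: float, b1: float, b2: float,
    p: float, q: float, r: float, s: float,
    T: float, n: int = 20000,
) -> List[Tuple[float, Tuple[int, int]]]:
    """Integrate (B12) backward from tau = 0 to tau = T with explicit Euler.

    Returns a list of (tau, (phi, psi)) entries, one for each change of
    the control pair given by (B10).
    """
    def controls(p3: float, p4: float) -> Tuple[int, int]:
        phi = 1 if b2 * p4 > -b1 * r else 0
        psi = 1 if a2 * p3 < a1 * p else 0
        return phi, psi

    dtau = T / n
    p3, p4 = q, -s  # terminal values of the dual variables
    phi, psi = controls(p3, p4)
    switches: List[Tuple[float, Tuple[int, int]]] = []
    for k in range(1, n + 1):
        dp3 = phi * b1 * r - (1 - phi) * b2 * p4
        dp4 = -psi * a1 * p - (1 - psi) * a2 * p3
        p3 += dtau * dp3
        p4 += dtau * dp4
        new = controls(p3, p4)
        if new != (phi, psi):
            switches.append((k * dtau, new))
            phi, psi = new
    return switches
```

Note that no force levels appear among the arguments, which is the numerical counterpart of the observation that target selection in the prescribed duration battle does not depend on initial strengths.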

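Case I. $b_1 r < b_2 s$ and $a_2 q > a_1 p$ 

We see that $\phi(T) = \psi(T) = 0$, so that near the end of battle 
(B12) becomes 

$$\frac{dp_3}{d\tau} = -b_2 p_4, \qquad p_3(\tau = 0) = q,$$ 

$$\frac{dp_4}{d\tau} = -a_2 p_3, \qquad p_4(\tau = 0) = -s,$$ 

whose solution is easily seen to be 

$$p_3(\tau) = q \cosh \sqrt{a_2 b_2} \, \tau + s \sqrt{b_2 / a_2} \, \sinh \sqrt{a_2 b_2} \, \tau,$$ 

$$p_4(\tau) = -s \cosh \sqrt{a_2 b_2} \, \tau - q \sqrt{a_2 / b_2} \, \sinh \sqrt{a_2 b_2} \, \tau.$$ 

Noting that $a_2 p_3(\tau) \ge a_2 q > a_1 p$ and $-b_2 p_4(\tau) \ge b_2 s > b_1 r$, we see from 
(B10) that $\phi(t) = \psi(t) = 0$ for all $t \in [0,T]$. 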

Case II. $b_1 r > b_2 s$ and $a_2 q > a_1 p$ 

We see that $\phi(T) = 1$ and $\psi(T) = 0$, so that for $0 \le \tau \le \tau_1$, 
where $\tau_1$ is the time of the first switch, (B12) becomes 

$$\frac{dp_3}{d\tau} = b_1 r, \qquad p_3(\tau = 0) = q,$$ 

$$\frac{dp_4}{d\tau} = -a_2 p_3, \qquad p_4(\tau = 0) = -s,$$ 

whose solution is given by 

$$p_3(\tau) = b_1 r \tau + q,$$ 

$$p_4(\tau) = -a_2 b_1 r \tau^2 / 2 - a_2 q \tau - s,$$ 

from which it is seen that $\phi$ is the variable which switches at $\tau_1$, 
which is the solution to 

$$-a_2 b_1 b_2 r \tau_1^2 / 2 - a_2 b_2 q \tau_1 - b_2 s = -b_1 r. \qquad \text{(B13)}$$ 

It is easily shown that once $\phi(t)$ switches to 0 there are no further 
changes. Hence, we have shown that 

for $0 \le t \le T - \tau_1$: $\phi(t) = 0$ and $\psi(t) = 0$, 

for $T - \tau_1 \le t \le T$: $\phi(t) = 1$ and $\psi(t) = 0$, 

where $\tau_1$ is determined from (B13). 
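
The switching time in Case II is the positive root of a quadratic. The sketch below assumes the condition takes the form $b_2 p_4(\tau_1) = -b_1 r$ with $p_4(\tau) = -a_2 b_1 r \tau^2 / 2 - a_2 q \tau - s$, and that $a_2$, $b_1$, $b_2$, $r$ are positive with $b_1 r > b_2 s$; the function name `case2_switch_time` is hypothetical.

```python
import math


def case2_switch_time(a2: float, b1: float, b2: float,
                      q: float, r: float, s: float) -> float:
    """Positive root of (a2*b1*b2*r/2) tau^2 + a2*b2*q tau - (b1*r - b2*s) = 0.

    Assumes Case II holds (b1*r > b2*s) and that a2, b1, b2, r are positive,
    so the discriminant is positive and exactly one root is positive.
    """
    qa = a2 * b1 * b2 * r / 2.0
    qb = a2 * b2 * q
    qc = -(b1 * r - b2 * s)
    return (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
```

If the battle length T is shorter than this root, no switch occurs within the battle and ODD fires on infantry throughout.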

Case III is similar to Case II. 

Case IV. $b_1 r > b_2 s$ and $a_2 q < a_1 p$ 

We see that $\phi(T) = \psi(T) = 1$, so that for $0 \le \tau \le \tau_1$, where 
$\tau_1$ is the time of the first switch, (B12) becomes 

$$\frac{dp_3}{d\tau} = b_1 r, \qquad p_3(\tau = 0) = q,$$ 

$$\frac{dp_4}{d\tau} = -a_1 p, \qquad p_4(\tau = 0) = -s,$$ 

whose solution is given by 

$$p_3(\tau) = b_1 r \tau + q,$$ 

$$p_4(\tau) = -a_1 p \tau - s,$$ 

whence we see that $\tau_1$ is given by 

$$\tau_1 = \min \left\{ \frac{a_1 p - a_2 q}{a_2 b_1 r}, \; \frac{b_1 r - b_2 s}{a_1 b_2 p} \right\}. \qquad \text{(B14)}$$ 

We could show that both strategy variables eventually change to 0 (if 
T is large enough). For example, if $\psi$ changes first at $\tau_1$, then 
we may show that for $\tau \ge \tau_1$ 

$$p_4(\tau) = -a_2 q \tau - s - (a_1 p - a_2 q) \tau_1 - a_2 b_1 r (\tau^2 - \tau_1^2) / 2,$$ 

so that $p_4$ continues to decrease and $\phi$ may also change to 0. 
In this example we have considered we would then have 

for $0 \le t \le T - \tau_2$: $\phi(t) = 0$ and $\psi(t) = 0$, 

for $T - \tau_2 \le t \le T - \tau_1$: $\phi(t) = 1$ and $\psi(t) = 0$, 

for $T - \tau_1 \le t \le T$: $\phi(t) = 1$ and $\psi(t) = 1$. 

What we do want to point out from the above development is that 
the optimum allocation of fire is independent of the force levels and 
depends only on the attrition rates and payoff weights (and length of 
battle). We also note that if q = s = 0 (only infantry weighted in the 
payoff), then Case IV above applies and the battle always terminates with 
the supporting weapon system fires concentrated on the ground forces, possibly 
preceded by a period of counterbattery fire. 
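
The first switching time in Case IV is the smaller of two ratios, one for each side's control. In this sketch the function name `case4_first_switch` is hypothetical, and all coefficients and the weights p and r are assumed positive with the Case IV inequalities holding.

```python
from typing import Tuple


def case4_first_switch(
    a1: float, a2: float, b1: float, b2: float,
    p: float, q: float, r: float, s: float,
) -> Tuple[float, str]:
    """Evaluate (B14): earliest backward-time switch and which control makes it.

    tau_psi solves a2 * (b1*r*tau + q) = a1*p   (EVEN's control switches),
    tau_phi solves b2 * (a1*p*tau + s) = b1*r   (ODD's control switches).
    """
    tau_psi = (a1 * p - a2 * q) / (a2 * b1 * r)
    tau_phi = (b1 * r - b2 * s) / (a1 * b2 * p)
    if tau_psi <= tau_phi:
        return tau_psi, "psi"
    return tau_phi, "phi"
```

With q = s = 0 the two ratios become $a_1 p / (a_2 b_1 r)$ and $b_1 r / (a_1 b_2 p)$, so each side spends a final period firing on infantry whose length depends only on rates and weights.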

Partial Development of Solution to Terminal Control Battle 

We consider the following problem (again the payoff is from ODD's 
standpoint) 

$$\max_{\phi} \min_{\psi} \{ p x_1(T) + q x_3(T) - r x_2(T) - s x_4(T) \} \quad \text{with T unspecified},$$ 

subject to: 

$$\dot{x}_1 = -\psi a_1 x_4,$$ 
$$\dot{x}_2 = -\phi b_1 x_3,$$ 
$$\dot{x}_3 = -(1 - \psi) a_2 x_4,$$ 
$$\dot{x}_4 = -(1 - \phi) b_2 x_3,$$ 

with initial conditions 

$$x_1(t = 0) = x_1^0, \quad x_2(t = 0) = x_2^0, \quad x_3(t = 0) = x_3^0, \quad x_4(t = 0) = x_4^0,$$ 

and terminal conditions similar to Weiss's original problem (see Figure 
B1). 

We will outline enough (hopefully) of the solution process to show 
points of difference with the prescribed duration battle. Within the 
framework of our solution procedure for terminal control attrition 
games (see Section d above), we have done only the first step (identify 
terminal states and determine extremal paths). 

As before, the Hamiltonian is given by 



= (|)X2(b2P^-b^P2) + i|^x^ ~ 



^2P3^ 



- '=2P4>‘3’ 



<B15) 



SO that the adjoint equations are given by 



P 

P 

p 

p 



1 

2 

3 

4 





3H 



3H 

3x, 

4 



0 => Pj^ (t) = const , 

0 p^ (t) = const , 

■ ‘t>)t>2P4> 

■ 'l')a2P3- 



(B16) 



From this point on the development is different for each terminal
state. We illustrate by considering the case when EVEN wins by destroying
ODD's infantry, i.e., $x_1(T) = 0$. The boundary conditions at the
termination of the battle in this case are

$$x_1(t = T) = 0, \qquad
\begin{aligned}
p_1(t = T) &= \text{unspecified}, & p_2(t = T) &= -r, \\
p_3(t = T) &= q, & p_4(t = T) &= -s.
\end{aligned}$$



Extremal strategies are determined by $\max_{\phi} \min_{\psi} H(t, x, p; \phi, \psi)$, which is
equivalent to

$$\max_{\phi}\{\phi (b_2 p_4 + b_1 r)\} \quad \text{and} \quad \min_{\psi}\{\psi (a_2 p_3 - a_1 p_1(T))\},$$

and, hence, extremal strategies are given by

$$\phi(t) = \begin{cases} 0 & \text{for } b_2 p_4 < -b_1 r \\ 1 & \text{for } b_2 p_4 > -b_1 r, \end{cases}
\qquad
\psi(t) = \begin{cases} 0 & \text{for } a_2 p_3 > a_1 p_1(T) \\ 1 & \text{for } a_2 p_3 < a_1 p_1(T). \end{cases} \tag{B17}$$

At t = T, we have

$$\phi(t = T) = \begin{cases} 0 & \text{for } b_1 r < b_2 s \\ 1 & \text{for } b_1 r > b_2 s, \end{cases}
\qquad
\psi(t = T) = \begin{cases} 0 & \text{for } a_2 q > a_1 p_1(T) \\ 1 & \text{for } a_2 q < a_1 p_1(T), \end{cases} \tag{B18}$$

which gives us various cases to consider.

Since the termination time is unspecified, the following transversality
condition must be satisfied at the end of battle:

$$H(t = T, x, p; \phi, \psi) = 0. \tag{B19}$$

We shall see that this condition has the effect of eliminating $\psi(t) = 0$
as an optimal strategy for EVEN during the closing stages of battle.

We consider two cases of terminating conditions affecting EVEN's
strategy variable $\psi$.
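The switching rules (B17) can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names and the sample numbers are hypothetical, and a tie (zero coefficient) is reported as `None` because (B17) does not determine the control there.

```python
def extremal_strategies(p3, p4, p1T, a1, a2, b1, b2, r):
    """Bang-bang controls (phi, psi) from (B17); None marks an undetermined tie."""
    phi_coeff = b2 * p4 + b1 * r      # ODD maximizes phi * phi_coeff
    psi_coeff = a2 * p3 - a1 * p1T    # EVEN minimizes psi * psi_coeff
    phi = 1.0 if phi_coeff > 0 else 0.0 if phi_coeff < 0 else None
    psi = 0.0 if psi_coeff > 0 else 1.0 if psi_coeff < 0 else None
    return phi, psi


# Hypothetical terminal values p3(T) = q = 1, p4(T) = -s = -1, p1(T) = 1.
phi_T, psi_T = extremal_strategies(p3=1.0, p4=-1.0, p1T=1.0,
                                   a1=0.05, a2=0.03, b1=0.04, b2=0.02, r=1.0)
```

With these sample numbers $b_1 r > b_2 s$ and $a_2 q < a_1 p_1(T)$, so both controls equal 1 at the end of battle, which is subcase (b) of Case B below.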

Case A. $a_2 q > a_1 p_1(T)$, implying $\psi(t = T) = 0$

We show that this case is impossible and drop it from further
consideration. We have the following two cases to consider.

(a) $b_1 r < b_2 s$

By (B18), we have $\phi(T) = 0$, so that (B15) and (B19) require that

$$-a_2 q x_{4s} + b_2 s x_{3s} = 0,$$

where $x_{is} = x_i(t = T)$ as used by Weiss. Since the above will, in
general, not be satisfied, this case is impossible.

(b) $b_1 r > b_2 s$

By (B18), we have $\phi(T) = 1$, so that (B15) and (B19) require that

$$-a_2 q x_{4s} + b_1 r x_{3s} = 0,$$

which likewise makes this case impossible.

Case B. $a_2 q < a_1 p_1(T)$, implying $\psi(t = T) = 1$

Again, we have two subcases to consider.

(a) $b_1 r < b_2 s$

By (B18), we have $\phi(T) = 0$, so that (B15) and (B19) require that

$$p_1(T) = (b_2 s x_{3s})/(a_1 x_{4s}), \tag{B20}$$

so that Case B is given by

$$a_2 q x_{4s} < b_2 s x_{3s}. \tag{B21}$$

(b) $b_1 r > b_2 s$

By (B18), we have $\phi(T) = 1$, so that (B15) and (B19) require that

$$p_1(T) = (b_1 r x_{3s})/(a_1 x_{4s}), \tag{B22}$$

so that Case B is given by

$$a_2 q x_{4s} < b_1 r x_{3s}. \tag{B23}$$

We will now investigate the above two subcases of Case B more
fully. Before we do this, let us rewrite the last two adjoint equations
(B16) in terms of the "backwards time" $\tau = T - t$:

$$\begin{aligned}
\frac{dp_3}{d\tau} &= \phi b_1 r - (1 - \phi) b_2 p_4, & p_3(\tau = 0) &= q, \\
\frac{dp_4}{d\tau} &= -\psi a_1 p_1(T) - (1 - \psi) a_2 p_3, & p_4(\tau = 0) &= -s.
\end{aligned} \tag{B24}$$

As we have shown above, the terminal state $x_1(T) = 0$ can only
be reached when $a_2 q < a_1 p_1(T)$, so that we have $\psi(t = T) = 1$. We
continue with the two subcases above.

(a) $b_1 r < b_2 s$ and $p_1(T) = (b_2 s x_{3s})/(a_1 x_{4s})$

By (B18), we have $\phi(T) = 0$, so that near the end of battle by
(B24) we have

$$\frac{dp_4}{d\tau} = -a_1 p_1(T)$$

and $p_4(\tau) = -a_1 p_1(T)\tau - s < 0$ for all $\tau$.

Hence $\phi(t) = 0$ for $0 \le t \le T$. We may show that $\psi(t)$ can switch to
0 at a backward time $\tau_\psi$, so we would have

for $0 \le t \le T - \tau_\psi$ : $\phi(t) = 0$ and $\psi(t) = 0$,

for $T - \tau_\psi \le t \le T$ : $\phi(t) = 0$ and $\psi(t) = 1$.

Determination of the domain of controllability is quite messy in this
case and we omit it at this time.

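(b) $b_1 r > b_2 s$ and $p_1(T) = (b_1 r x_{3s})/(a_1 x_{4s})$, so that $a_2 q x_{4s} < b_1 r x_{3s}$

By (B18), we have $\phi(T) = 1$, so that near the end of battle we have

$$p_4(\tau) = -a_1 p_1(T)\tau - s$$

or

$$p_4(\tau) = -\frac{b_1 r x_{3s}}{x_{4s}}\,\tau - s.$$

$\phi(t)$ switches to 0 at the backward time $\tau_\phi$ given by

$$\tau_\phi = \frac{(b_1 r - b_2 s)\,x_{4s}}{b_1 b_2 r\,x_{3s}},$$

and to summarize

for $0 \le \tau < \tau_\phi$ : $\phi = 1$,

for $\tau_\phi < \tau$ : $\phi = 0$.

Other details are similar to the previous case.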
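As a numerical check of subcase (b), the following sketch evaluates $p_1(T) = b_1 r x_{3s}/(a_1 x_{4s})$ from the transversality condition and the backward time at which $b_2 p_4(\tau)$ falls to $-b_1 r$, using $p_4(\tau) = -a_1 p_1(T)\tau - s$ (valid while $\psi = 1$). The function names and the sample values are hypothetical and serve only as an illustration.

```python
def terminal_p1(b1, r, a1, x3s, x4s):
    """p_1(T) from H(T) = 0 with phi(T) = psi(T) = 1 (subcase (b))."""
    return b1 * r * x3s / (a1 * x4s)


def phi_switch_time(b1, b2, r, s, x3s, x4s):
    """Backward time at which b2*p4(tau) reaches -b1*r, so phi drops from 1 to 0.

    Uses p4(tau) = -(b1*r*x3s/x4s)*tau - s, which holds while psi = 1.
    """
    if b1 * r <= b2 * s:
        raise ValueError("subcase (b) requires b1*r > b2*s")
    return (b1 * r - b2 * s) * x4s / (b1 * b2 * r * x3s)


# Hypothetical sample values, chosen only for illustration.
b1, b2, r, s, a1 = 0.04, 0.02, 1.0, 1.0, 0.05
x3s, x4s = 50.0, 80.0

tau_phi = phi_switch_time(b1, b2, r, s, x3s, x4s)
p4_at_switch = -a1 * terminal_p1(b1, r, a1, x3s, x4s) * tau_phi - s
```

At the computed time the switching coefficient $b_2 p_4 + b_1 r$ vanishes, which is the condition separating $\phi = 1$ from $\phi = 0$.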
j. Implications of Models.

It seems appropriate to discuss briefly the general implications
in the following areas:

(1) intelligence,

(2) command and control systems,

(3) human decision making.

Even though the present models assume complete and instantaneous
information, their solution does possess certain features capable of
being projected to cases where uncertainty is present. The selection
of tactics is seen to depend on a knowledge of the enemy's strength and
capabilities, so that the appropriate target set may be chosen and optimal
strategies determined. Previous models [14], [15], [34] (battles of
prescribed duration) had not indicated such a conclusion; in them, tactics
depended only on enemy and friendly capabilities and length of combat,
not on the initial force levels. For such models the estimate of the
combat length is critical, since if one were to extend this time, the
optimal strategies may have to be determined again from the beginning.

The shifting of tactics with time (instantaneously in the model) 
indicates requirements for a responsive command structure. For the case 
studied here, the loser of a battle may receive more benefits from a 
command structure capable of implementing a change of tactics during 
the confusion of combat. 

Schreiber [70] has proposed "overkill" as a measure of "command
efficiency." His idea is to modify the description of combat to reflect
differences in command and control capabilities. One uses a linear law
(see Section g) when fire is not redirected from killed targets. However,
we do not see the full implication of such diminishing returns in
combat here. In Appendix C we shall see that when there is a linear
law attrition process for the target types over which fire is distributed,
the nature of the allocation policy is fundamentally different.

These models may be interpreted to show the value of human judgment
in combat. They indicate, as do common sense and experience,
that in battle a commander must use his judgment to ascertain toward what
end the course of battle can be steered, so that he may devise his
strategy accordingly. The demonstrated sensitivity of these models to
many factors shows the importance of human assessment of a situation
and of the value attached to forces remaining after the battle at hand.

A further discussion is to be found in Appendix C.

APPENDIX C. Some One-Sided Dynamic Allocation Problems.

In this appendix we examine a sequence of problems to study the
dependence of optimal allocation policies on model form. The problems
are for combat over a period of time described by Lanchester-type
equations with a choice of tactics available to one side and subject
to change with time. We consider two types of choice problems: (1)
target-type selection and (2) firing rate.

In 1964 Dolansky [28] noted that the Lanchester theory of combat
was insufficiently developed in the area of target selection for combat
between heterogeneous forces (optimal control/differential games). This
remark was based on consideration of work by Weiss [82] and Isbell and
Marlow [52], both of which we have extended in previous appendices.
Since that time no further examples have been published in the literature
except for the ones in Isaacs' book [50]. This previous work had
never systematically investigated the dependence of tactics on model
form.

With the first sequence of models our goal is to obtain insight
into optimal target selection rules in real combat by gaining a more
thorough understanding of some simple models and the solution characteristics
of such models. To understand the operations of a complex
system, many times the researcher examines a sequence of models of
greater and greater complexity to try to see if he can discern a "law
of nature." In the first two models we shall see how the objectives
of the combatants and the termination conditions of the conflict
influence target selection through the evolution of marginal return.
Then we examine the effect of number of target types and type of
attrition process.

We then examine a sequence of models to see how ammunition
limitations affect firing rates. The results of this section are of
a more preliminary nature. Then we discuss two-sided extensions of
such problems but point out the value of studying one-sided problems
as considered in this paper. Finally, various implications of the
models studied are discussed.

a. Target Selection.

The simplest situation of target selection that we could conceive
of is one of combat between an X-force of two force types (for example,
riflemen and grenadiers) and a homogeneous Y-force (for example, riflemen
only). This situation is shown diagrammatically below.




It is the objective of the Y-force commander to maximize his survivors
at the end of battle at time T and minimize those of his opponent
(considering weighting factors p, q and r). This is accomplished
through his choice of the fraction of fire, $\phi$, directed at $X_1$. There
are several scenarios that we could apply to the above idealized combat
situation: two of these are (1) a battle lasting a specified time, T,
or (2) a battle lasting until one side or the other was totally annihilated.
We will now examine each of these.

1. Battle of Prescribed Duration, T.
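Mathematically the problem may be stated as

$$\underset{\phi(t)}{\text{maximize}}\;\; r y(T) - p x_1(T) - q x_2(T) \quad \text{with } T \text{ specified}$$

subject to:

$$\begin{aligned}
\frac{dx_1}{dt} &= -\phi a_1 y, \\
\frac{dx_2}{dt} &= -(1 - \phi) a_2 y, \\
\frac{dy}{dt} &= -b_1 x_1 - b_2 x_2,
\end{aligned}$$

$$x_1, x_2, y \ge 0 \quad \text{and} \quad 0 \le \phi \le 1,$$

where

p, q and r are weighting factors assigned to surviving forces,

$x_1$, $x_2$ and $y$ are average force strengths,

$a_1$, $a_2$, $b_1$ and $b_2$ are constant attrition rates, and

$\phi$ is the fraction of Y-fire directed at $X_1$.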



This problem may be solved by routine application of the Pontryagin
maximum principle [68]. The solution when $a_1 b_1 > a_2 b_2$ is shown in
Table C1. The other case, when $a_1 b_1 < a_2 b_2$, is symmetric to this one.
This present analysis ignores those subcases when a state variable is
reduced to zero.

The Hamiltonian for this problem is

$$H(t, x, p, \phi) = \phi y(-a_1 p_1 + a_2 p_2) - a_2 p_2 y - p_3 (b_1 x_1 + b_2 x_2).$$

The extremal control is determined by maximizing $H(t, x, p, \phi)$ over $\phi(t)$, and
hence

$$\phi(t) = \begin{cases} 0 & \text{for } p_2 a_2 < p_1 a_1 \\ 1 & \text{for } p_2 a_2 > p_1 a_1. \end{cases}$$

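Table C1. Solution to Target Selection Problem: Battle of Prescribed Duration

The adjoint differential equations (note that these are independent of
the state variables) are given by

$$\begin{aligned}
\frac{dp_1}{dt} &= -\frac{\partial H}{\partial x_1} = b_1 p_3 & \text{with } p_1(t = T) &= -p, \\
\frac{dp_2}{dt} &= -\frac{\partial H}{\partial x_2} = b_2 p_3 & \text{with } p_2(t = T) &= -q, \\
\frac{dp_3}{dt} &= -\frac{\partial H}{\partial y} = \phi a_1 p_1 + (1 - \phi) a_2 p_2 & \text{with } p_3(t = T) &= r.
\end{aligned}$$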


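It is convenient to define $v(t) = a_1 p_1(t) - a_2 p_2(t)$. The condition
which determines the extremal control is then

$$\phi(t) = \begin{cases} 0 & \text{for } v(t) > 0, \\ 1 & \text{for } v(t) < 0. \end{cases}$$

Introducing the reverse time variable $\tau = T - t$, we consider the
following equivalent system of differential equations:

$$\begin{aligned}
\frac{dp_2}{d\tau} &= -b_2 p_3 & \text{with } p_2(\tau = 0) &= -q, \\
\frac{dp_3}{d\tau} &= -\phi v - a_2 p_2 & \text{with } p_3(\tau = 0) &= r, \\
\frac{dv}{d\tau} &= -(a_1 b_1 - a_2 b_2) p_3 & \text{with } v(\tau = 0) &= -a_1 p + a_2 q.
\end{aligned}$$

These equations may be solved to show that up until the first switch
in tactics

$$p_3(\tau) = r \cosh\!\left(\sqrt{\phi a_1 b_1 + (1 - \phi) a_2 b_2}\;\tau\right) + \frac{\phi a_1 p + (1 - \phi) a_2 q}{\sqrt{\phi a_1 b_1 + (1 - \phi) a_2 b_2}} \sinh\!\left(\sqrt{\phi a_1 b_1 + (1 - \phi) a_2 b_2}\;\tau\right).$$

It is easy to show that $p_3(\tau) > 0$ for all $\tau \ge 0$.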

We see that consideration of the case $a_1 b_1 > a_2 b_2$ is motivated
by the coefficient of the differential equation for $v(\tau)$.

There are two further cases to consider.

Case (a) $a_1 p > a_2 q$

We have that $\phi(\tau = 0) = 1$, since $v(\tau = 0) < 0$. Now since
$p_3(\tau) > 0$, we always have $dv/d\tau < 0$ and $v(\tau)$ never can change sign.
Thus, we never switch. Hence, for $0 \le t \le T$, we have $\phi(t) = 1$.

Case (b) $a_1 p < a_2 q$

We have that $\phi(\tau = 0) = 0$, since $v(\tau = 0) > 0$. Since $p_3(\tau) > 0$,
we always have $dv/d\tau < 0$, and we can have a switch in tactics.

The backward time of this switch in tactics, $\tau = \tau_1$, is determined
from the integration of

$$\frac{dv}{d\tau} = -(a_1 b_1 - a_2 b_2) p_3,$$

where it is recalled that $\phi(\tau) = 0$ for $0 \le \tau \le \tau_1$ in this interval. It is easily
shown that

$$v(\tau) = -(a_1 b_1 - a_2 b_2)\left\{\frac{r}{\sqrt{a_2 b_2}} \sinh\!\left(\sqrt{a_2 b_2}\;\tau\right) + \frac{q}{b_2} \cosh\!\left(\sqrt{a_2 b_2}\;\tau\right)\right\} - a_1 p + \frac{a_1 b_1}{b_2}\,q.$$

Thus, we determine $\tau_1$ from the transcendental equation $v(\tau = \tau_1) = 0$,
and the result shown in Table C1 is obtained.
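The transcendental equation $v(\tau_1) = 0$ has no closed-form root in general, but it is easy to solve numerically because $v(\tau)$ decreases monotonically when $a_1 b_1 > a_2 b_2$. The sketch below uses simple bisection on $v(\tau) = -(a_1 b_1 - a_2 b_2)\{(r/\sqrt{a_2 b_2})\sinh(\sqrt{a_2 b_2}\,\tau) + (q/b_2)\cosh(\sqrt{a_2 b_2}\,\tau)\} - a_1 p + a_1 b_1 q/b_2$. The function names, the bracketing strategy and the coefficient values are hypothetical choices made for illustration only.

```python
import math


def v_backward(tau, a1, b1, a2, b2, p, q, r):
    """Switching function v(tau) while phi = 0 (fire on X2), in backward time tau."""
    w = math.sqrt(a2 * b2)
    bracket = (r / w) * math.sinh(w * tau) + (q / b2) * math.cosh(w * tau)
    return -(a1 * b1 - a2 * b2) * bracket - a1 * p + a1 * b1 * q / b2


def switch_time(a1, b1, a2, b2, p, q, r, tau_max=1e3, tol=1e-10):
    """Bisection root of v(tau) = 0 for Case (b): a1*b1 > a2*b2 and a1*p < a2*q."""
    if not (a1 * b1 > a2 * b2 and a1 * p < a2 * q):
        raise ValueError("no switch expected outside Case (b)")
    lo, hi = 0.0, 1.0
    while v_backward(hi, a1, b1, a2, b2, p, q, r) > 0:  # expand until sign change
        hi *= 2.0
        if hi > tau_max:
            raise RuntimeError("no root found below tau_max")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v_backward(mid, a1, b1, a2, b2, p, q, r) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical coefficients for illustration only.
params = dict(a1=0.04, b1=0.03, a2=0.02, b2=0.02, p=1.0, q=3.0, r=1.0)
tau_1 = switch_time(**params)
```

Under these assumptions, a battle shorter than `tau_1` would see fire on $X_2$ throughout, while a longer battle would open with fire on $X_1$ and switch to $X_2$ at $t = T - \tau_1$. Note that no force level enters the computation.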

It is seen that for the battle of prescribed duration target
selection depends only on the attrition rates of the various force types
and relative weights assigned to surviving force types. For this model,
target selection is independent of force levels. This is not surprising,
since the adjoint differential equations are independent of the state
variables and the values of the dual variables at the end of battle
t = T are independent of force strengths. It is recalled that a dual
variable represents the rate of change of the payoff with respect to a
particular state variable [12]. Thus, if $V = r y(T) - p x_1(T) - q x_2(T)$,
then $p_1(t) = \frac{\partial V}{\partial x_1}(t)$, etc. Hence the boundary conditions are given for
the dual variables at the end of the battle t = T as
$p_1(t = T) = \frac{\partial V}{\partial x_1}(t = T) = -p$, $p_2(t = T) = -q$, $p_3(t = T) = r$.

It seems appropriate to discuss further the interpretation of
the solution shown in Table C1. From the above definition of the dual
variables,

$$a_1 p_1(t) = \left(\text{return per unit time for engaging } X_1\right) = \left(\text{kill rate of } Y \text{ against } X_1\right) \times \left(\text{return of destroying } X_1\right).$$

Hence, the condition $a_1 p < a_2 q$ means that at the end of the battle
(recall that $p_1(t = T) = -p$, etc.) there is greater payoff per unit
time per soldier for Y to engage $X_2$ (short term gain at the end of
battle). The value of the dual variable, for example, $p_1(\tau)$, also
accounts for the effectiveness of $X_1$ against Y. The condition
$a_1 b_1 > a_2 b_2$ may be interpreted to mean that there is more long range
return for engaging $X_1$. Thus, case A of Table C1 corresponds to where
there is both more long range and also short range return for engaging
$X_1$. Case B corresponds to more short term gain at the end of the battle
for engaging $X_2$, but more long range return for engaging $X_1$. When
remaining forces at t = T are weighted proportional to their kill rates
against Y, i.e., $p/q = b_1/b_2$, then case A is the only one possible.
A switch in tactics (target priority) is seen to occur for this model
when more utility is assigned to survivors of a target-type than in
proportion to their destructive capability (kill rate) per unit relative
to other target types.

The maximum principle may be interpreted as saying that a target 
type from several alternatives is engaged when such an engagement 
yields the greatest marginal return. It turns out, though, that the 
marginal value of target engagement evolves differently for different 
model forms. This is clearly seen when we examine the solution for a
"fight to the finish."

2. Fight to the Finish.

We consider the similar problem of

$$\underset{\phi(t)}{\text{maximize}}\;\; r y(T) - p x_1(T) - q x_2(T) \quad \text{with } T \text{ unspecified}$$

subject to:

$$\begin{aligned}
\frac{dx_1}{dt} &= -\phi a_1 y, \\
\frac{dx_2}{dt} &= -(1 - \phi) a_2 y, \\
\frac{dy}{dt} &= -b_1 x_1 - b_2 x_2,
\end{aligned}$$

$$x_1, x_2, y \ge 0, \quad 0 \le \phi \le 1,$$

and with terminal states defined by (1) $x_1(T) = x_2(T) = 0$ and (2)
$y(T) = 0$.

The terminal surface of this problem is seen to consist of five
parts:

$x_1(T) = 0$, $x_2(T) > 0$, $y(T) = 0$,

$x_1(T) = 0$ before $x_2(T) = 0$, $y(T) > 0$,

$x_1(T) = 0$ after $x_2(T) = 0$, $y(T) > 0$,

$x_1(T) > 0$, $x_2(T) = 0$, $y(T) = 0$,

$x_1(T) > 0$, $x_2(T) > 0$, $y(T) = 0$.

The above problem was first studied by Isbell and Marlow [52],
and we develop its solution in detail in Appendix A. The solution to
this problem when $a_1 b_1 > a_2 b_2$ is shown in Table A1.

In contrast to the battle of prescribed duration, it is seen
that optimal target engagement may depend on initial force levels. When
Y wins, he engages $X_1$ until depletion before $X_2$. When Y loses,
he may switch from firing at $X_1$ entirely to firing at $X_2$ entirely
before the $X_1$ force has been annihilated. This happens when survivors
of force-type $X_2$ are assigned utility in excess of their kill rate
as compared with force-type $X_1$, and certain relationships hold between
initial force strengths. This dependence of the optimal allocation on
initial strengths has been caused by the fact that values of dual variables
at t = T are dependent upon values of the state variables.
This happens in terminal control attrition problems where a value of
a state variable is specified at the terminal surface (and hence the
value of the corresponding dual variable is unspecified but may be
determined from the transversality condition $H(t = T, x, p, \phi) = 0$).

3. Generalizations to More Target Types.

It is of interest to inquire as to what solution properties
generalize to more than two heterogeneous force types. For combat
described by a generalized Lanchester square law, it turns out that the
"bang-bang" character of the allocation (the optimal control is an extreme
point of the control variable space) will always hold.

Let us consider the following prescribed duration battle model:

$$\underset{\phi_i(t)}{\text{maximize}}\;\; v y(T) - \sum_{i=1}^{n} w_i x_i(T) \quad \text{with } T \text{ specified}$$

subject to:

$$\frac{dx_i}{dt} = -\phi_i a_i y \quad \text{for } i = 1, \ldots, n,$$

$$\frac{dy}{dt} = -\sum_{i=1}^{n} b_i x_i,$$

$$x_i, y \ge 0, \quad \phi_i \ge 0, \quad \text{and} \quad \sum_{i=1}^{n} \phi_i = 1.$$

The Hamiltonian, $H(t, x, p, \phi)$, is given by

$$H = -\sum_{i=1}^{n} \phi_i p_i a_i y - p_{n+1} \sum_{i=1}^{n} b_i x_i,$$

where $p_i$ is the dual variable for the $i$th state equation. By
application of the maximum principle, we are led to

$$\underset{\phi_i}{\text{minimize}}\;\left\{\sum_{i=1}^{n} \phi_i p_i a_i\right\} \quad \text{subject to: } \sum_{i=1}^{n} \phi_i = 1,\;\; \phi_i \ge 0.$$

Let $j$ be the index such that $a_j p_j = \text{minimum}(a_1 p_1, \ldots, a_n p_n)$. Then
$\phi_i = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta and is equal to 1 for
i = j and is equal to 0 otherwise, and all fire is concentrated on
one target type.
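The selection rule just described (put all fire on the target type with the smallest product $a_i p_i$) amounts to picking a vertex of the simplex $\sum \phi_i = 1$, $\phi_i \ge 0$. A minimal sketch follows; the function name, the tie-breaking rule and the sample numbers are hypothetical.

```python
def bang_bang_allocation(a, p):
    """Return phi with all fire on the index minimizing a[i] * p[i].

    a: kill rates of Y against each target type; p: current dual variables.
    Ties are broken by the lowest index (an arbitrary choice).
    """
    if len(a) != len(p) or not a:
        raise ValueError("a and p must be non-empty and of equal length")
    products = [ai * pi for ai, pi in zip(a, p)]
    j = min(range(len(products)), key=products.__getitem__)
    return [1.0 if i == j else 0.0 for i in range(len(a))]


# Hypothetical values: duals are negative because surviving X units reduce Y's payoff.
a = [0.03, 0.05, 0.02]
p = [-2.0, -1.5, -3.0]
phi = bang_bang_allocation(a, p)
```

Because the objective is linear in the $\phi_i$, no interior point of the simplex can do better than the best vertex, which is why the square-law allocation is always bang-bang.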

It is of interest to ask whether the optimal tactic will always 
be to concentrate fire on only one target type (bang-bang optimal 
control). The answer to this question turns out to be "no," as the
following simple example shows.

4. Linear Law Allocation.

So far the state equations have described combat according to the 
Lanchester square law in which attrition of a target type is proportional 
to the number of each force type firing at it. Weiss [81] has given 
a thorough discussion of the conditions which lead to this. These 
conditions include that "each unit is informed about the location of 
the remaining opposing units so that when a target is destroyed, fire 
may be immediately shifted to a new target." It is noted that the 
control theory models which we have considered so far have implicitly 
assumed perfect information. 

Another model for attrition is the Lanchester linear law in which
the average decrease of a target type is proportional to the product
of the average number of targets remaining and the number of each force
type firing at it. Such a dependence can arise under two general
circumstances: (1) fire is uniformly distributed over a constant target
area ("area fire") or (2) the mean time of target acquisition is much
larger than target destruction time and is inversely proportional to
target density. The first circumstance corresponds to the simplest case
of partial information. Again quoting Weiss [81], we assume that units
are informed about the general areas in which opposing units are located,
but are not informed about the consequences of their own fire. Thus,
we see that we may account for some changes in the information set by
modifying the description of combat. Brackney [22] has shown that
"aimed fire" may lead to a linear law when target acquisition times are
considered.

Thus, we consider the following problem in which the X-forces'
attrition obeys a linear law and the Y-forces' attrition obeys a
square law:

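$$\underset{\phi(t)}{\text{maximize}}\;\; r y(T) - p x_1(T) - q x_2(T) \quad \text{with } T \text{ specified}$$

subject to:

$$\begin{aligned}
\frac{dx_1}{dt} &= -\phi a_1 x_1 y, \\
\frac{dx_2}{dt} &= -(1 - \phi) a_2 x_2 y, \\
\frac{dy}{dt} &= -b_1 x_1 - b_2 x_2,
\end{aligned}$$

$$x_1, x_2, y \ge 0 \quad \text{and} \quad 0 \le \phi \le 1.$$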



All analytical details of the solution to the above problem have 
not been worked out, since the state and adjoint equations do not 
readily yield an analytic solution. However, it is possible to discuss 
qualitatively the nature of the optimal control, even though certain 
quantities have not been explicitly evaluated. 

There is a major difference in the solution to this problem from
the previous ones. This difference is that the optimal allocation, $\phi$,
may be other than 0 or 1. The Hamiltonian for this problem is given
by

$$H(t, x, p, \phi) = \phi(-p_1 a_1 x_1 y) + (1 - \phi)(-p_2 a_2 x_2 y) - p_3 (b_1 x_1 + b_2 x_2), \tag{C1}$$

and hence under "normal" circumstances the control is determined by

$$\phi(t) = \begin{cases} 0 & \text{for } p_2 a_2 x_2 < p_1 a_1 x_1 \\ 1 & \text{for } p_2 a_2 x_2 > p_1 a_1 x_1. \end{cases} \tag{C2}$$

The adjoint equations are given by

$$\begin{aligned}
\frac{dp_1}{dt} &= -\{-p_1 a_1 y \phi - p_3 b_1\}, \\
\frac{dp_2}{dt} &= -\{-p_2 a_2 y (1 - \phi) - p_3 b_2\}, \\
\frac{dp_3}{dt} &= -\{-p_1 \phi a_1 x_1 - p_2 (1 - \phi) a_2 x_2\},
\end{aligned}$$

or

$$\begin{aligned}
\frac{dp_1}{dt} &= p_1 \phi a_1 y + p_3 b_1, & p_1(t = T) &= -p, \\
\frac{dp_2}{dt} &= p_2 (1 - \phi) a_2 y + p_3 b_2, & p_2(t = T) &= -q, \\
\frac{dp_3}{dt} &= p_1 \phi a_1 x_1 + p_2 (1 - \phi) a_2 x_2, & p_3(t = T) &= r.
\end{aligned} \tag{C3}$$

In contrast with the previous problem, it is now possible to have other
than a bang-bang optimal control. We may have a singular solution [53],
for which the necessary condition that the Hamiltonian be maximized
(with respect to the control variable) does not provide us with a well-defined
expression for the extremal control. This occurs when the
coefficient of $\phi$ in the Hamiltonian vanishes for a finite interval
of time.

A singular extremal is determined from the conditions [54]

$$\frac{\partial H}{\partial \phi} = 0 \quad \text{and} \quad \frac{d}{dt}\left(\frac{\partial H}{\partial \phi}\right) = 0.$$

Hence, the following conditions must hold on a singular surface:

$$p_1 a_1 x_1 = p_2 a_2 x_2 \quad \text{and} \quad a_1 b_1 x_1 = a_2 b_2 x_2. \tag{C4}$$

On the singular surface, the extremal control is given by

$$\phi = \frac{a_2}{a_1 + a_2}. \tag{C5}$$

It may also be shown that such a singular control is impossible for
problems a1 and a2. Thus, singular control (non-concentration of fire
on only one target type) is impossible for Lanchester square law
attrition but does play a central role in allocation when attrition
follows a linear law.
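One way to see why $\phi = a_2/(a_1 + a_2)$ is the singular control is that it makes $x_1$ and $x_2$ decay at the same relative rate, so a state that starts on the line $a_1 b_1 x_1 = a_2 b_2 x_2$ stays on it. The sketch below checks this numerically with a forward-Euler integration of the linear-law state equations $dx_1/dt = -\phi a_1 x_1 y$, $dx_2/dt = -(1-\phi) a_2 x_2 y$, $dy/dt = -b_1 x_1 - b_2 x_2$. The integrator, the step size and all numerical values are hypothetical choices for illustration.

```python
def simulate_linear_law(x1, x2, y, a1, a2, b1, b2, phi, t_end, dt=1e-3):
    """Forward-Euler integration of the linear-law model with constant phi."""
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dx1 = -phi * a1 * x1 * y
        dx2 = -(1.0 - phi) * a2 * x2 * y
        dy = -b1 * x1 - b2 * x2
        x1, x2, y = x1 + dt * dx1, x2 + dt * dx2, y + dt * dy
    return x1, x2, y


# Hypothetical coefficients; the start point lies on L: a1*b1*x1 = a2*b2*x2.
a1, a2, b1, b2 = 0.002, 0.001, 0.01, 0.02
x1_0, x2_0, y_0 = 100.0, 100.0, 150.0
phi_singular = a2 / (a1 + a2)

x1, x2, y = simulate_linear_law(x1_0, x2_0, y_0, a1, a2, b1, b2, phi_singular, t_end=5.0)
drift = a1 * b1 * x1 - a2 * b2 * x2  # stays near zero on the singular arc

# For comparison, concentrating all fire on X1 moves the state off the line.
x1c, x2c, _ = simulate_linear_law(x1_0, x2_0, y_0, a1, a2, b1, b2, 1.0, t_end=5.0)
drift_concentrated = a1 * b1 * x1c - a2 * b2 * x2c
```

Any other constant $\phi$ makes one force type shrink faster than the other, so the trajectory leaves the line, as the second call illustrates.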

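We must test to see if this singular solution can yield the
optimal return. A necessary condition for a singular subarc to yield
the maximum return [57] is

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} \ge 0.$$

A rather laborious computation shows that

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = p_3\, y\, (a_1^2 b_1 x_1 + a_2^2 b_2 x_2)$$

on the singular surface, and hence for $p_3(t) > 0$, we have that

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} > 0.$$

Thus, since it may be shown that $p_3(t) > 0$ always, the necessary condition is
met for the singular path to be optimal.

In constructing the extremal trajectories and tracing the optimal
course of battle (backwards from the end of the prescribed duration
battle) it is convenient to introduce

$$v(t) = -a_1 p_1 x_1 + a_2 p_2 x_2; \tag{C6}$$

then

$$\frac{dv}{dt} = -a_1 \frac{dp_1}{dt} x_1 - a_1 p_1 \frac{dx_1}{dt} + a_2 \frac{dp_2}{dt} x_2 + a_2 p_2 \frac{dx_2}{dt}.$$

Using the state equations and the adjoint equations (C3), we obtain
from the above

$$\frac{dv}{dt} = -p_3 (a_1 b_1 x_1 - a_2 b_2 x_2),$$

or, in terms of the backwards time $\tau = T - t$, this becomes

$$\frac{dv}{d\tau} = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2). \tag{C7}$$

We may write (C6) as

$$v(\tau) = -\frac{p_2(\tau)}{b_2}\left\{\frac{p_1(\tau)/b_1}{p_2(\tau)/b_2}\, a_1 b_1 x_1 - a_2 b_2 x_2\right\}. \tag{C8}$$

We note that (C2) and (C6) may be combined to yield the non-singular
control

$$\phi(\tau) = \begin{cases} 1 & \text{for } v(\tau) > 0 \\ 0 & \text{for } v(\tau) < 0, \end{cases} \tag{C9}$$

and the singular control is

$$\phi(\tau) = \frac{a_2}{a_1 + a_2} \quad \text{for } v(\tau) = 0, \tag{C10}$$

when the system is in the state described by (C4).

We note that at the end of battle $\tau = 0$, we have

$$v(\tau = 0) = a_1 p x_1(t = T) - a_2 q x_2(t = T). \tag{C11}$$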

If we were to consider in Figure C1 the line L' defined by
$a_1 p x_1 = a_2 q x_2$, it would appear above, on, or below the line L defined
by $a_1 b_1 x_1 = a_2 b_2 x_2$ according to whether $p/q$ were greater than, equal
to, or less than $b_1/b_2$. This is evident from considering the slopes of
these two lines

$$\left(\frac{dx_2}{dx_1}\right)_{L} = \frac{a_1 b_1}{a_2 b_2}, \qquad \left(\frac{dx_2}{dx_1}\right)_{L'} = \frac{a_1 p}{a_2 q},$$

and hence, for example,

$$\left(\frac{dx_2}{dx_1}\right)_{L'} > \left(\frac{dx_2}{dx_1}\right)_{L} \iff \frac{p}{q} > \frac{b_1}{b_2}.$$

The significance of the line L' and its relationship to the line L
is that

$$v(\tau = 0) \begin{cases} > 0 & \text{below } L' \\ < 0 & \text{above } L', \end{cases} \tag{C12}$$

and hence by (C9) we find that

$$\phi(t = T) = \begin{cases} 1 & \text{for } P(T) \text{ below } L' \\ 0 & \text{for } P(T) \text{ above } L', \end{cases} \tag{C13}$$

where $P(t = T) = (x_1(t = T), x_2(t = T))$. We also note from (C7) that

$$\frac{dv}{d\tau}(\tau) \begin{cases} > 0 & \text{below } L \\ < 0 & \text{above } L. \end{cases} \tag{C14}$$

Thus, (C12) and (C14) give us three cases to consider:

Case (a) $p/q = b_1/b_2$,

Case (b) $p/q > b_1/b_2$,

Case (c) $p/q < b_1/b_2$.

Figure C1. Optimal Allocation for Linear Law Attrition, Case (a)

We consider Case (a) first. The solution for this case is shown
diagrammatically in Figure C1. Even though explicit expressions have not
been obtained for the state and adjoint variables, the dependence of
the control on these quantities can still be discussed. It may be shown
that the optimal control depends on the state variables $x_1$ and $x_2$
(and also attrition coefficients) in each "decision region." Above
the line $a_1 b_1 x_1 = a_2 b_2 x_2$, denoted by L, the control $\phi = 0$ is
used until this line is encountered. When L is reached, the singular
control $\phi = a_2/(a_1 + a_2)$ is used until the end of the battle at t = T.

The above type of solution holds for arbitrary initial values of $x_1$
and $x_2$: $x_1(t = 0) = x_1^0$ and $x_2(t = 0) = x_2^0$. The time history of the
optimal control is traced for two particular initial force ratios shown
as point A and point B. At point B, $x_1^0/x_2^0 > a_2 b_2/(a_1 b_1)$ and hence $\phi = 1$
is used until the line L is encountered.

For Case (a): $p/q = b_1/b_2$, the above statements are proved as follows.
At $\tau = 0$ equation (C8) reduces to

$$v(\tau = 0) = \frac{q}{b_2}\left[a_1 b_1 x_1(t = T) - a_2 b_2 x_2(t = T)\right]. \tag{C15}$$

From (C15) we see that there are three cases to consider depending on
the sign of the term in square brackets.

Case (1) $a_1 b_1 x_1(t = T) = a_2 b_2 x_2(t = T)$

We see that this corresponds to when the system ends up on the
singular subarc. In this case $\phi(t = T) = a_2/(a_1 + a_2)$, and we continue
(in backwards progression) to use the singular control $\phi(\tau) = a_2/(a_1 + a_2)$
(note that $dv/d\tau = 0$ when this is used and that we had $v(\tau = 0) = 0$)
until $x_1(\tau) = x_1^0$ or $x_2(\tau) = x_2^0$. This yields three further subcases.

Subcase (1A) $a_1 b_1 x_1^0 < a_2 b_2 x_2^0$

Define $t_1$ as t such that $x_1(t_1 > 0) = x_1^0$. Then we use
$\phi = 0$ for $0 \le t \le t_1$. This is consistent since $v(\tau = T - t_1) = 0$
and

$$\frac{dv}{d\tau} = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2) \quad \text{for } T - t_1 \le \tau \le T$$

is negative, which implies $v(\tau) < 0$ and hence $\phi(t) = 0$.

Subcase (1B) $a_1 b_1 x_1^0 > a_2 b_2 x_2^0$

Define $t_2$ as t such that $x_2(t_2 > 0) = x_2^0$. Then we use
$\phi = 1$ for $0 \le t \le t_2$. This is consistent since $v(\tau = T - t_2) = 0$
and

$$\frac{dv}{d\tau} = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2) \quad \text{for } T - t_2 \le \tau \le T$$

is positive, which implies $v(\tau) > 0$ and hence $\phi(t) = 1$.

Subcase (1C) $a_1 b_1 x_1^0 = a_2 b_2 x_2^0$

We use $\phi(t) = a_2/(a_1 + a_2)$ from the beginning.

Case (2) $a_1 b_1 x_1(t = T) < a_2 b_2 x_2(t = T)$

Since $v(\tau = 0) = (q/b_2)(a_1 b_1 x_1 - a_2 b_2 x_2) < 0$ at the end of battle,
we have $\phi(\tau = 0) = 0$. We work backwards from the end. Since we are
above the line L, $dv/d\tau = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2) < 0$ and hence $v(\tau) < 0$
for all $\tau \in [0, T]$. Thus we have $\phi(t) = 0$ for $0 \le t \le T$.

Case (3) $a_1 b_1 x_1(t = T) > a_2 b_2 x_2(t = T)$

Since $v(\tau = 0) = (q/b_2)(a_1 b_1 x_1 - a_2 b_2 x_2) > 0$ at the end of battle,
we have $\phi(\tau = 0) = 1$. We work backwards from the end. Since we are
below the line L, $dv/d\tau = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2) > 0$ and hence $v(\tau) > 0$
for all $\tau \in [0, T]$. Thus we have $\phi(t) = 1$ for $0 \le t \le T$.

The above cases are shown in Figure C2. It is to be noted that
in the above development we have made use of the fact that $p_3(\tau) > 0$
for all $\tau$.



We now consider Case (b): $p/q > b_1/b_2$. There are two cases to be
considered.

Case (1) never on singular subarc for finite interval of time

Again there are two subcases to consider, depending on whether
the system winds up above or below L.

Subcase (1a) $a_1 b_1 x_1(t = T) > a_2 b_2 x_2(t = T)$

Since

$$v(\tau = 0) = \frac{q}{b_2}\left\{\frac{p/q}{b_1/b_2}\, a_1 b_1 x_1(t = T) - a_2 b_2 x_2(t = T)\right\},$$

we see that $v(\tau = 0) > 0$ and hence by (C9) $\phi(\tau = 0) = 1$. Since
$dv/d\tau = p_3 (a_1 b_1 x_1 - a_2 b_2 x_2) > 0$ when we are below
L, and we stay there, we have $v(\tau) > 0$ for all
$\tau \in [0, T]$. Thus we have $\phi(t) = 1$ for $0 \le t \le T$.

Figure C2. Battle Histories for Prescribed Duration Battle, Case (a).

Subcase (1b) $a_1 b_1 x_1(t = T) < a_2 b_2 x_2(t = T)$

Again there are two further subcases to consider, depending on
whether the system winds up above or below L'.

Subcase (1bI) $a_1 b_1 x_1(t = T) < a_2 b_2 x_2(t = T)$ and
$a_1 p x_1(t = T) < a_2 q x_2(t = T)$

In this case we wind up above L'. Since $v(\tau)$ is given by
(C6), we have $v(\tau = 0) < 0$ and hence by (C9) $\phi(\tau = 0) = 0$. Since
we are above L, $dv/d\tau$ (given by (C7)) $< 0$ for all $\tau \in [0, T]$ and hence
$v(\tau) < 0$ for all $\tau \in [0, T]$. Thus we have $\phi(t) = 0$ for $0 \le t \le T$.

Subcase (1bII) $a_1 b_1 x_1(t = T) < a_2 b_2 x_2(t = T)$ and
$a_1 p x_1(t = T) > a_2 q x_2(t = T)$

In this case we wind up below L' at the end. Since $v(\tau)$ is
given by (C6), we have $v(\tau = 0) > 0$ and hence by (C9) $\phi(\tau = 0) = 1$.
We work backwards from the end. Since we are above L, $dv/d\tau < 0$ while
we remain above L. Thus $v(\tau)$ decreases for $\tau > 0$. There are two
further subcases depending on whether $v(\tau)$ decreases to zero before
the line L is encountered. Let $\tau_1$ be such that $v(\tau_1) = 0$. If L
has not been reached at $\tau_1$, then $v(\tau)$ for $\tau > \tau_1$ is negative and
$\phi(t) = 0$ until the beginning of battle. It is also possible to reach
L just at $v(\tau_1) = 0$. In this case (assuming we don't remain on
singular subarc) $v(\tau) > 0$ for $\tau \ge \tau_1$ since we pass below L and

Case (2) on singular subarc for finite interval of time

This can happen only when $a_1 b_1 x_1(t = T) < a_2 b_2 x_2(t = T)$ and
$a_1 p x_1(t = T) > a_2 q x_2(t = T)$. As usual, we work backwards from the end
of battle. We use $\phi(\tau) = 1$ for $0 \le \tau \le \tau_1$, and at $\tau = \tau_1$ we
must have $a_1 b_1 x_1(\tau_1) = a_2 b_2 x_2(\tau_1)$. We use the singular control
$\phi(\tau) = a_2/(a_1 + a_2)$ for $\tau_1 \le \tau \le \tau_2$. There are three further subcases:

(1) $x_1(\tau_2) = x_1^0$, $x_2(\tau_2) < x_2^0$,

(2) $x_1(\tau_2) < x_1^0$, $x_2(\tau_2) = x_2^0$,

(3) $x_1(\tau_2) = x_1^0$, $x_2(\tau_2) = x_2^0$.

We omit the trivial discussion of these cases.

Thus we see from the above that there are six possible cases for
the history of combatant force strengths in the battle of prescribed
duration:

(1) started below L and never reached L,

(2) always above L',

(3) started above L' and end up above L but below L'
without ever reaching L,

(4) end up above L but started below L and did not remain
on L for finite interval of time,

(5) started above (or on) L and were on L for finite
interval of time,

(6) started below L and were on L for finite interval of time.

These six cases are shown in Figure C3. The reader should compare the
solution we have sketched here with that of Bellman's continuous version
of the strategic bombing problem (see [9] pp. 227-233). Case (c):
$p/q < b_1/b_2$ is similar to Case (b).

Figure C3. Optimal Allocation for Linear Law Attrition, Case (b)

The reader's attention is directed to the interpretation of these
three cases. Case (a) is when Y assigns utility to surviving X-force
types in exact proportion to their destructive capability against Y.
Case (b) is when Y assigns a greater utility to surviving $X_1$ than
in proportion to their kill rate against Y relative to that of $X_2$.
It is recalled that similar remarks were made with respect to the
solution of problem a1.

b. Effect of Resource Constraints.

In this section we will examine a sequence of models of increasing 
complexity for which the effect of ammunition limitations on firing 
rate (fire discipline) will be explored. In each case, we consider two 
homogeneous forces engaged in combat described by a square law. The 
research on these models has not progressed as far as that on the earlier 
ones. For some of these models the results are of a preliminary nature, 
the entire solution not having been completely worked out. 

1 . Battle of Prescribed Duration with Constant Kill Rates . 

We consider the situation

\[ \max_{\phi(t)} \; p\,x(T) - q\,y(T) \quad \text{with } T \text{ specified} \]

subject to:

\[ \frac{dx}{dt} = -a_1 y, \qquad \frac{dy}{dt} = -\phi v a_2 x, \qquad \frac{dz}{dt} = \phi v, \]

\[ x, y \ge 0, \quad 0 \le \phi \le 1, \quad z(t = 0) = 0, \quad \text{and} \quad z(t = T) \le A < vT = v \int_0^T dt, \]

where v is the maximum firing rate of each X unit. It is noted that
the nature of the attrition coefficients $a_1$ and $a_2$ is different,
since $a_1$ has incorporated in it a constant firing rate.

This corresponds to the case where each X combatant has a limited
supply of ammunition, denoted by A. We assume that this supply is such
that he could not fire at his maximum firing rate for the prescribed
duration of the battle, for when $A \ge vT$ it is easily seen that the
optimal strategy is to fire at the maximum possible rate, $\phi(t) = 1$
for $0 \le t \le T$.

The optimal regulation of firing rate turns out to be

\[ \phi(t) = 1 \quad \text{for } 0 \le t \le T_1, \quad \text{where } T_1 = \frac{A}{v}, \]

\[ \phi(t) = 0 \quad \text{for } T_1 \le t \le T. \]

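This was determined as follows. The Hamiltonian is given by

\[ H = -a_1 p_1 y - \phi v a_2 p_2 x + \phi v p_3 \]

and hence

\[ \phi(t) = \begin{cases} 0 & \text{for } p_3 < a_2 p_2 x, \\ 1 & \text{for } p_3 > a_2 p_2 x. \end{cases} \]

The adjoint differential equations are given by

\[ \dot{p}_1 = -\frac{\partial H}{\partial x} = \phi v a_2 p_2 \quad \text{with } p_1(t = T) = p, \]

\[ \dot{p}_2 = -\frac{\partial H}{\partial y} = a_1 p_1 \quad \text{with } p_2(t = T) = -q, \]

\[ p_3(t) = \text{const.} \]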

We introduce the reverse time variable $\tau = T - t$ and consider a
backwards integration of the state and dual variables from the fixed
end of the battle, $t = T$. Hence, $dp_1/d\tau = -\phi v a_2 p_2$, etc.
It is easy to show that $p_1(\tau)$, $x(\tau)$, and $y(\tau)$ are
non-decreasing functions of $\tau$ (regardless of $\phi$) with
$p_1(\tau = 0) = p$, $x(\tau = 0) = x(T)$, and $y(\tau = 0) = y(T)$.
Similarly, $p_2(\tau)$ is a strictly decreasing function of $\tau$.
Hence, $Q(\tau) = a_2 p_2(\tau) x(\tau)$ is a strictly decreasing
function of $\tau$ with an initial value of
$Q(\tau = 0) = -a_2 q\,x(T)$, $p_3$ must be negative, and $\phi(\tau)$
never switches back to 0 once it becomes 1.

This solution is disturbing, since it is not intuitively appealing
to fire at one's maximum firing rate until one runs out of ammunition
and to spend the final stages of battle without ammunition. Hence, we
are led to consider other models for further insight.
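The preference for firing early can be checked numerically. The following is a minimal sketch, assuming the model above in the form dx/dt = -a_1 y, dy/dt = -phi v a_2 x, dz/dt = phi v; the function name `simulate`, the Euler step, and all numerical values are illustrative assumptions and are not taken from this report.

```python
# Sketch: compare "fire first" with "fire last" for a limited ammunition supply.
# All parameter values are hypothetical and chosen only for illustration.

def simulate(phi, x0=100.0, y0=100.0, a1=0.01, a2=0.01,
             v=1.0, T=10.0, p=1.0, q=1.0, dt=1e-3):
    """Euler integration of dx/dt = -a1*y, dy/dt = -phi*v*a2*x, dz/dt = phi*v.

    Returns the payoff p*x(T) - q*y(T) and the ammunition used z(T).
    """
    x, y, z = x0, y0, 0.0
    for k in range(int(round(T / dt))):
        f = phi(k * dt)
        # The tuple assignment uses the old x and y on the right-hand side.
        x, y, z = x - a1 * y * dt, y - f * v * a2 * x * dt, z + f * v * dt
    return p * x - q * y, z

A, v, T = 5.0, 1.0, 10.0   # ammunition per man, maximum firing rate, duration
T1 = A / v                 # time needed to expend all ammunition at full rate

def fire_first(t):
    return 1.0 if t < T1 else 0.0

def fire_last(t):
    return 1.0 if t >= T - T1 else 0.0

payoff_first, used_first = simulate(fire_first, v=v, T=T)
payoff_last, used_last = simulate(fire_last, v=v, T=T)
print("fire first:", payoff_first, "ammunition used:", used_first)
print("fire last: ", payoff_last, "ammunition used:", used_last)
```

For these assumed values both policies expend the same ammunition, and the policy that fires first yields the larger payoff, in agreement with the bang-bang solution stated above.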

2. Battle of Prescribed Duration with Time Varying Kill Rates . 

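We consider the situation

\[ \max_{\phi(t)} \; p\,x(T) - q\,y(T) \quad \text{with } T \text{ specified} \]

subject to:

\[ \frac{dx}{dt} = -a_1(t)\, y, \qquad \frac{dy}{dt} = -\phi v a_2(t)\, x, \qquad \frac{dz}{dt} = \phi v, \]

\[ x, y \ge 0, \quad 0 \le \phi \le 1, \quad z(t = 0) = 0, \quad \text{and} \quad z(t = T) \le A < vT. \]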



It seems reasonable to assume that in many real world situations $a_1(t)$
and $a_2(t)$ would be monotonically increasing functions of time, e.g.,
two forces closing with each other. All the previous solution steps
remain the same except for the effect of $a_1(t)$ and $a_2(t)$ increasing
with time. This may change the solution markedly, although the optimal
control is still bang-bang. The quantity
$Q(\tau) = a_2(\tau) p_2(\tau) x(\tau)$ is not guaranteed to be a
strictly decreasing function of $\tau$, since $a_2(\tau)$ is strictly
decreasing (but positive) and $p_2(\tau) x(\tau)$ is negative. This
allows the possibility that the optimal tactic may be to hold one's
fire and conserve ammunition in the early stages of battle so that
$\phi(t = T) = 1$ at the end of battle.

The way in which ammunition is conserved depends on the specific 
nature of $a_1(t)$ and $a_2(t)$. It seems worthwhile to explore optimal
tactics for several simple time dependencies of these quantities, but 
this hasn’t been done as yet. We would recommend that this be a future 
research task. In Appendix D, we develop the solution to variable 
coefficient (either force separation or time as the independent variable) 
Lanchester-type equations when the ratio of attrition rates is a constant. 
This allows an analytic solution to be obtained for the problem at hand 
in special instances. It is not unreasonable to expect to encounter 
cases in which one holds his fire until the kill probability reaches 
some threshold value. An aspect that is disturbing is that the control 
has turned out to be bang-bang. One can show, in fact, that a singular 
solution is impossible for this problem. 
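The possibility that holding fire becomes preferable when kill rates grow with time can be illustrated numerically. The sketch below assumes the model dx/dt = -a_1(t) y, dy/dt = -phi v a_2(t) x and compares the two extreme bang-bang policies only; the linear time dependence of the coefficients, the function `simulate`, and all numbers are hypothetical choices made for illustration.

```python
# Sketch: effect of time-varying kill rates on the timing of fire.
# Coefficient shapes and numbers are assumptions, not results from this report.

def simulate(phi, a1, a2, x0=100.0, y0=100.0, v=1.0, T=10.0,
             p=1.0, q=1.0, dt=1e-3):
    """Euler integration of dx/dt = -a1(t)*y, dy/dt = -phi(t)*v*a2(t)*x."""
    x, y = x0, y0
    for k in range(int(round(T / dt))):
        t = k * dt
        x, y = x - a1(t) * y * dt, y - phi(t) * v * a2(t) * x * dt
    return p * x - q * y

A, v, T = 5.0, 1.0, 10.0
T1 = A / v

def fire_first(t):
    return 1.0 if t < T1 else 0.0

def fire_last(t):
    return 1.0 if t >= T - T1 else 0.0

# Constant kill rates (the model of the preceding subsection).
const = dict(a1=lambda t: 0.01, a2=lambda t: 0.01)
# Kill rates increasing with time, e.g. two forces closing with each other.
rising = dict(a1=lambda t: 0.0002 * t, a2=lambda t: 0.002 * t)

const_first = simulate(fire_first, **const)
const_last = simulate(fire_last, **const)
rising_first = simulate(fire_first, **rising)
rising_last = simulate(fire_last, **rising)

print("constant rates: first", const_first, " last", const_last)
print("rising rates:   first", rising_first, " last", rising_last)
```

With these assumed coefficients the ordering of the two policies reverses: firing first is better for constant rates, while conserving ammunition for the end of the battle is better when the kill rates rise steeply. The sketch does not establish the optimal switching time, only that the early-fire policy need not remain optimal.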

R. Isaacs has studied some similar problems in his book Differential
Games [50] and has explored some aspects of this problem in much
greater depth than presented here. Isaacs tried to resolve the problem
of shooting up all of one's ammunition before the end of the battle by
modifying the payoff. Another approach might be to consider a terminal
control problem.

3. Fight to the Finish with Limited Ammunition . 

Thus we are led to consider

\[ \max_{\phi(t)} \; p\,x(T) - q\,y(T) \quad \text{with } T \text{ unspecified} \]

subject to:

\[ \frac{dx}{dt} = -a_1 y, \qquad \frac{dy}{dt} = -\phi v a_2 x, \qquad \frac{dz}{dt} = \phi v, \]

\[ x, y \ge 0, \quad 0 \le \phi \le 1, \quad z(t = 0) = 0, \quad \text{and} \quad z(t = T) \le A, \]

with terminal states defined by (1) x(T) = 0 and (2) y(T) = 0.

We briefly consider the constant attrition coefficient case, although 
it is noted that a similar analysis would apply to time dependent 
attrition coefficients. As with the previous terminal control problem, 
dual variables (marginal gains) now are related to the final values 
of the state variables by virtue of $H(t,x,p,\phi) = \text{const.} = 0 = H(t = T,x,p,\phi)$.
We might encounter a case where tactics are dependent
on enemy force level (in the previous limited ammunition cases, tactics 
are independent of enemy force level), but this case has not yet been 
explored very far. 

One point worth noting is that for the constant attrition coefficient
case the X forces, in order to win, must have enough ammunition to fire
at their maximum rate during the entire duration of the battle. Hence,
we see that concentration of forces reduces the ammunition requirement
per man, since the length of battle is determined by initial numbers
of forces committed to battle.
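The dependence of the ammunition requirement on initial force size can be made concrete. The sketch below assumes constant coefficients with X firing at full rate throughout, so that dx/dt = -a_1 y and dy/dt = -b x with b = v a_2, and uses the standard square-law solution y(t) = y_0 cosh(lambda t) - x_0 sqrt(b/a_1) sinh(lambda t), lambda = sqrt(a_1 b). The function name and the numbers are hypothetical.

```python
import math

def time_to_annihilate_y(x0, y0, a1, b):
    """Battle length until y = 0 for dx/dt = -a1*y, dy/dt = -b*x.

    Setting y(t) = 0 in the square-law solution gives
    tanh(lambda*t) = (y0/x0)*sqrt(a1/b) with lambda = sqrt(a1*b).
    Returns math.inf when X cannot annihilate Y.
    """
    ratio = (y0 / x0) * math.sqrt(a1 / b)
    if ratio >= 1.0:
        return math.inf
    return math.atanh(ratio) / math.sqrt(a1 * b)

# Illustrative (assumed) values: v rounds per unit time, kill rates a1 and a2.
v, a1, a2, y0 = 1.0, 0.01, 0.01, 100.0
for x0 in (110.0, 150.0, 200.0, 400.0):
    T = time_to_annihilate_y(x0, y0, a1, v * a2)
    # Ammunition each X combatant needs in order to fire at rate v until T.
    print(f"x0 = {x0:6.1f}   battle length = {T:8.2f}   rounds per man = {v * T:8.2f}")
```

Under these assumptions the battle length, and with it the ammunition needed per man, falls as the initial X force grows, which is the concentration effect noted above.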

4 . Two-Sided Extension . 

There appears to be a novel feature in a two-sided version of the
above problems. Again, we briefly make a few remarks about the constant
attrition coefficient case.

\[ \max_{\phi(t)} \, \min_{\psi(t)} \; p\,x(T) - q\,y(T) \quad \text{with } T \text{ specified} \]

subject to:

\[ \frac{dx}{dt} = -\psi v_2 a_1 y, \qquad \frac{dy}{dt} = -\phi v_1 a_2 x, \]

\[ \frac{du}{dt} = \phi v_1, \qquad \frac{dv}{dt} = \psi v_2, \]

\[ x, y \ge 0, \quad 0 \le \phi, \psi \le 1, \]

\[ u(t = 0) = 0, \quad u(t = T) \le A_1 < v_1 T, \]

\[ v(t = 0) = 0, \quad v(t = T) \le A_2 < v_2 T. \]



Unlike the previous one-sided version of this problem, it is now possible 
to have <()(t = T) = 1 with limited ammunition. This possibility has 
arisen since the Y forces may hold their fire during the early stages 
of engagement. Questions now arise as to the advantage of delivering 
the first shot, e.g., is there a time lag before fire is returned?, and 
we move into the realm of games of timing studied at RAND [55]. 

c . Extensions to Differential Games . 

There is an intimate connection between the mathematical bases
of optimal control theory and differential game theory. It has been
stated that optimal control problems may be viewed as one-sided
differential games for which the roles of all but one of the competing
players have been suppressed [12]. A concise discussion of the
interrelationships between these two subjects is contained in
Y. C. Ho's [41] excellent review of Isaacs' book [50] (see also
Chapter 9 in [24]).

If one takes a Hamilton-Jacobi approach to these variational
problems, this relationship becomes particularly evident. In an optimal
control problem we are seeking the solution to the following partial
differential equation for the optimal return, S (referred to as
Hamilton's characteristic function in the calculus of variations
literature [69]),

\[ \frac{\partial S}{\partial t} + \max_{\phi(t)} H\!\left(t, x, \frac{\partial S}{\partial x}, \phi\right) = 0, \]

with appropriate boundary conditions. In a differential game we seek
the solution to

\[ \frac{\partial S}{\partial t} + \max_{\phi(t)} \, \min_{\psi(t)} H\!\left(t, x, \frac{\partial S}{\partial x}; \phi, \psi\right) = 0. \]

It also seems appropriate to mention the relationship of dynamic program- 
ming to these techniques. Consideration of the equation satisfied by 
the optimal return points out clearly an important aspect of dynamic 
programming, its being a discrete approximation technique for solving 
variational problems [30]. It is, however, a dual approach which 
generates an optimal trajectory as an envelope of tangents rather than 
as a sequence of points [10]. The value of the continuous models lies 
in their ability to exhibit explicitly the dependence of optimal tactics 
on model parameters rather than any computational ease. 

It is noted that the existing theory for differential games 
assumes that the optimal strategy (during any finite interval of time) 
is always a pure strategy. Hence, it is necessary that max min H = 
min max H almost everywhere in time. There are, however, differential 
games of practical interest for which pure strategy solutions do not 
exist [11].

In light of the above discussion, it is easy to see the value of
beginning the study of mathematical models of tactical allocation with 
optimal control. It is true that actual combat is a competitive environ- 
ment in which the actions of both parties must be considered, but optimal 
control problems may be used to study most significant aspects of such 
problems: setting proper boundary conditions, devising solution procedures, 

study of singular solutions, differences in solutions for different forms 
of model. Most solution aspects of the one-sided problem are present 
in the two-sided one. It is assumed that formulation of these two-sided 
problems is clear from the previous content of this paper. 

Of interest to the operations research worker is whether there is 
any new aspect of solution behavior in a differential game. The answer 
to this is "yes." In devising a rigorous solution procedure for the 
supporting weapon system game of H. K. Weiss [82], we have (see Appendix 
B) encountered solution behavior unique to terminal control attrition 
games: there may exist a domain of controllability for a given terminal 

state but entry to this state may be "blockable" by the "losing" player. 

In other words, there is a path determined by the necessary conditions 
leading from each point in a region of the initial state space to a 
terminal state, but the "losing" player may use a strategy other than 
his extremal strategy for this path to actually win. In the process 
of solving the supporting weapon system game and trying to understand 
the many complicated facets of its solution procedure, we gained 
insight by considering a related optimal control problem (see Appendix 
A), the Isbell and Marlow fire programming problem [52].

d. Implications of Models.

It seems appropriate to briefly discuss the general implications 
in the following areas of the models examined in this paper: 

(1) optimal tactical allocation, 

(2) intelligence, 

(3) command and control systems, 

(4) human decision making. 

The discussion of these areas is not mutually exclusive. 

Of interest to the military tactician is whether target priorities
are static or evolve dynamically during the course of battle. With
respect to optimal control models, this may be mathematically stated
as whether there are transition (switching) surfaces in
the solution. We have seen in the idealized and simplified models
studied here that target priorities do change. This is related to the 
evolution of marginal return of target destruction (value of dual 
variable). We have seen that this evolution depends on the goals of 
the combatants (utility assigned to surviving force types at the end 
of the battle) and also the conditions which terminate the battle. In 
the terminal control problem studied here, a shift in target priorities 
is present only in a losing case, whereas in a fixed duration battle 
such a switch is independent of winning or losing but depends only on 
weapon system capabilities and the prescribed duration of battle. 

Even though these models assume complete and instantaneous 
information, it appears that some inferences may be made for cases 
where uncertainty is present. In the terminal control case, we saw
that selection of tactics depends on a knowledge of the enemy's strength
and capabilities, since the terminal state of combat must be determined 
before optimal strategies can be. For a battle of prescribed duration, 
e.g., fighting a delaying action in a retrograde movement to protect 
the withdrawal of troops, tactics depend only on enemy and friendly 
capabilities and length of combat, not the initial force levels. For 
such cases the estimate of combat length is critical, since changes in 
target priorities are determined relative to the end of the engagement. 

Schreiber [70] has proposed an idealized and simple, yet
illuminating, way of quantitatively showing the value of intelligence
and command control capabilities. He introduces the concept of "command
efficiency," which is measured by the fraction of the enemy's destroyed
units from which fire has been redirected. The effect of poor
intelligence and poor capabilities for redirecting fire from destroyed
targets is to produce "overkill." Schreiber's equations for combat
involve this fraction called "command efficiency," and they reduce to
Lanchester-type equations for area fire when the fraction is 0 and for
aimed fire when it is 1. We have seen that the optimal tactics are quite
different for these two cases. When intelligence and command control
systems are very efficient, the optimal tactic is seen to be
concentration of fire on a specific target type. When capability for
redirection of fire from destroyed targets is poor (either through
damage assessment or constraints on new target acquisition), the optimal
tactic may be to allocate fire in a proportional fashion over target
types in a way that holds the ratios of target density in each target
area constant. Another implication is that supporting weapon systems
(e.g., artillery) concentrate fire on selected point targets, but that
fire is allocated proportionately over various area targets. Thus, these
models suggest that the tactics of target engagement may vary with
command and control capabilities.

These models also show the importance of intelligence in devising 
the best tactics in combat. Intelligence on enemy weapon system 
capabilities (kill rates including target acquisition rates) and poten- 
tial length of engagement play a central part. We also have seen that 
for fights to the finish and linear law attrition cases intelligence 
on enemy force levels is also required. For artillery fire support 
missions against various troop concentrations, knowledge of troop 
densities is essential in the assignment of target priorities.
Particularly dense concentrations where the initial kill potential is
high are seen to be cases where the optimal tactic is to concentrate
fire on one target for a while.

Another argument for the concentration of forces is seen to emerge 
from the study of these simplified models. When ammunition is limited, 
a concentration of forces has the effect of counter-balancing this 
constraint. For example, in a fire fight numerical superiority could 
mean that the enemy force level would be reduced such that he would 
disengage in time before the friendly ammunition restriction became
critical.

These models may be interpreted to show the value of human judgment
in combat. They indicate, as do common sense and experience, that in
battle a commander must use his judgment to ascertain to what end
the course of battle can be steered so that he may devise his strategy
accordingly. The demonstrated sensitivity of these models to many
factors shows the importance of human assessment of a situation and
the importance of good judgment in assigning utility to forces surviving
the battle at hand.

e . Summary . 

The results of this appendix may be summarized as follows: 

(1) a sequence of one-sided models has been presented which shows 
that the tactics of target selection may be sensitive to 
force strengths, target acquisition process, the type of 
attrition process, and/or the termination conditions of
combat,

(2) a sequence of models has been presented which shows some
preliminary results on the effect of resource constraints
on firing discipline and concentration of forces,

(3) tactics for target selection are heavily dependent upon 
"command efficiency," 

(4) concentration of fire on one target type among many occurs 
as an optimal tactic only when target acquisition is not 
subject to diminishing returns.

APPENDIX D. Solution to Variable Coefficient Lanchester-Type Equations.

In Appendix C, we briefly considered a model involving Lanchester- 
type equations with variable coefficients. Although such equations 
have been studied by analysts for over 10 years since H. Weiss’ pioneering 
work [81] , analytic solutions for the average force strengths (state 
variables) as a function of an independent variable (either time or 
range) have been obtained in only isolated instances [19], [20]. We 
have discovered a very general method for solving such variable coeffi- 
cient equations under certain assumptions about the average attrition 
rates of the combatants. We point out, however, that all previously 
published results [73] except one are contained in the general results 
presented here. Additionally, these new results also apply to cases in 
which the relative velocity of combatant forces is a function of force 
separation. 

We show how to solve Lanchester-type equations for combat between 
two homogeneous forces when the attrition rates are variable provided 
that their quotient is a constant. Solutions are developed for either 
time or force separation as the independent variable. We also investi- 
gate under what circumstances each of Bonder’s two second order differential 
equations [20] can be transformed into a constant coefficient equation 
yielding exponential solutions. We begin by briefly reviewing previous 
work on this topic. 

H. Weiss [81] extended Lanchester-type equations to include the
relative movement of two homogeneous forces, allowing time and space
to be "traded" for casualties. He considered the two attrition rates
to be dependent upon force separation in such a way that their quotient
was a constant. S. Bonder [19], [20] and others [73] have used Weiss'
extension to study the effects of mobility and various range
dependencies of the average attrition rates on the number of surviving
forces. For each force type, Bonder developed a second order
differential equation which related average force strength to the force
separation, r, and obtained solutions for cases of constant relative
velocity of forces.

We show that more general results are easily obtainable by consid- 
ering the original first order system of equations with either time or 
force separation as the independent variable (as is appropriate for the 
problem under study). Bonder’s results [20] and the constant attrition 
rate solution are but special instances of our more general results. 

a. Range Dependent Attrition Rates . 

The case of range dependent attrition rates originally motivated 
this approach, although it is now seen to be a special case of time 
dependent attrition rates. We use the same notation as Bonder [20], [73] 
for the battlefield coordinates. 

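We consider

\[ \frac{dx}{dt} = -\alpha(r)\, y, \qquad \frac{dy}{dt} = -\beta(r)\, x, \]

where $\alpha(r) = k_\alpha g(r)$ and $\beta(r) = k_\beta g(r)$, so that

\[ \frac{\alpha(r)}{\beta(r)} = \frac{k_\alpha}{k_\beta}, \]

and x, y are average force strengths,

$\alpha(r)$, $\beta(r)$ are average (range dependent) attrition rates.

Considering force separation, r, as the independent variable, we
have $\frac{d}{dt} = v \frac{d}{dr}$ and thus the equations become

\[ \frac{dx}{dr} = -k_\alpha \frac{g(r)}{v(r)}\, y, \qquad \frac{dy}{dr} = -k_\beta \frac{g(r)}{v(r)}\, x. \tag{D1} \]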



We consider the relative velocity of the forces to be a function of 
force separation only. As Weiss [81] has pointed out, these equations 
readily yield a square law relationship between the state variables 



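\[ k_\beta (x_0^2 - x^2) = k_\alpha (y_0^2 - y^2). \tag{D2} \]

Solving equation (D2) for y, substituting the result into the first
of equations (D1), and integrating from $r = R_0$ and $x = x_0$ to r
and x, we obtain

\[ \ln \left[ \frac{x + \sqrt{x^2 + (k_\alpha / k_\beta)\, y_0^2 - x_0^2}}{x_0 + y_0 \sqrt{k_\alpha / k_\beta}} \right] = -\sqrt{k_\alpha k_\beta} \int_{R_0}^{r} \frac{g(u)}{v(u)}\, du. \tag{D3} \]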



Raising e to the power of each side of equation (D3), we obtain the
following result after some algebraic manipulation:

\[ x(r) = x_0 \cosh \theta + y_0 \sqrt{k_\alpha / k_\beta}\, \sinh \theta, \]

where

\[ \theta(r) = -\sqrt{k_\alpha k_\beta} \int_{R_0}^{r} \frac{g(u)}{v(u)}\, du. \tag{D4} \]

A similar expression is readily obtained for y(r). Bonder's [20]
results are special cases of equations (D4).

b. Time Dependent Attrition Rates.

More generally, we might be interested in

\[ \frac{dx}{dt} = -k_\alpha h(t)\, y, \qquad \frac{dy}{dt} = -k_\beta h(t)\, x. \]

The same approach as above readily yields

\[ x(t) = x_0 \cosh \theta + y_0 \sqrt{k_\alpha / k_\beta}\, \sinh \theta, \]

where

\[ \theta(t) = -\sqrt{k_\alpha k_\beta} \int_0^t h(u)\, du. \tag{D5} \]

When h(t) = 1, equations (D5) reduce to the familiar constant coefficient
solution. When $h(t) = g(r(t))$ and $r(t) = R_0 + \int_0^t v(s)\, ds$,
equations (D5) reduce to equations (D4).
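As a numerical check, the closed form can be compared with a direct integration of the differential equations. The sketch below assumes the system dx/dt = -k_alpha h(t) y, dy/dt = -k_beta h(t) x with solution x = x_0 cosh(theta) + y_0 sqrt(k_alpha/k_beta) sinh(theta) and theta(t) = -sqrt(k_alpha k_beta) times the integral of h from 0 to t; the function names, the choice h(t) = 1 + t, and the numbers are hypothetical.

```python
import math

def closed_form(t, x0, y0, k_alpha, k_beta, H):
    """Closed-form force strengths; H(t) is the integral of h from 0 to t."""
    theta = -math.sqrt(k_alpha * k_beta) * H(t)
    s = math.sqrt(k_alpha / k_beta)
    x = x0 * math.cosh(theta) + y0 * s * math.sinh(theta)
    y = y0 * math.cosh(theta) + (x0 / s) * math.sinh(theta)
    return x, y

def rk4(t_end, x0, y0, k_alpha, k_beta, h, n=1000):
    """Classical fourth-order Runge-Kutta integration of the same system."""
    def f(t, x, y):
        return -k_alpha * h(t) * y, -k_beta * h(t) * x
    dt = t_end / n
    t, x, y = 0.0, x0, y0
    for _ in range(n):
        k1x, k1y = f(t, x, y)
        k2x, k2y = f(t + dt / 2, x + dt / 2 * k1x, y + dt / 2 * k1y)
        k3x, k3y = f(t + dt / 2, x + dt / 2 * k2x, y + dt / 2 * k2y)
        k4x, k4y = f(t + dt, x + dt * k3x, y + dt * k3y)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        t += dt
    return x, y

# Assumed example: h(t) = 1 + t, so H(t) = t + t**2 / 2.
x_cf, y_cf = closed_form(3.0, 100.0, 80.0, 0.02, 0.01, lambda t: t + t * t / 2)
x_num, y_num = rk4(3.0, 100.0, 80.0, 0.02, 0.01, lambda t: 1.0 + t)
print("closed form:", x_cf, y_cf)
print("numerical:  ", x_num, y_num)
```

Setting h(t) = 1 in `closed_form` (that is, H(t) = t) gives the constant coefficient square-law solution.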

c . Some Comments . 

We see from the above that the effect of time (range) dependent
average attrition rates of the form considered is to transform the time
(range) scale of the usual square law attrition process. Thus
certain time (range) intervals are weighted more heavily in the
transformed time (range) scale than they are in the usual square law
attrition process.

Previous analytic work [73] has assumed the relative velocity
between forces to be constant. These results allow this restriction to
be relaxed. For example, we may now easily study combat situations in
which relative velocity is a decreasing function of force separation.

We would strongly recommend that the results developed here be
used in extensions of the allocation models developed in the previous
appendix. The approach developed here also applies to the solution of
the adjoint equations in the determination of our new dynamic kill
potential developed in Appendix F.
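A range-dependent relative velocity is easily handled numerically. The sketch below assumes the range form of the solution, x(r) = x_0 cosh(theta) + y_0 sqrt(k_alpha/k_beta) sinh(theta) with theta(r) = -sqrt(k_alpha k_beta) times the integral of g(u)/v(u) from R_0 to r, and evaluates the integral by the trapezoidal rule. The shapes chosen for g(r) and v(r), the function names, and all numbers are hypothetical.

```python
import math

def theta(r, R0, g, v, k_alpha, k_beta, n=2000):
    """theta(r) = -sqrt(k_alpha*k_beta) * integral_{R0}^{r} g(u)/v(u) du."""
    step = (r - R0) / n                      # negative when the forces close
    total = 0.5 * (g(R0) / v(R0) + g(r) / v(r))
    for i in range(1, n):
        u = R0 + i * step
        total += g(u) / v(u)
    return -math.sqrt(k_alpha * k_beta) * total * step

def strengths(r, x0, y0, R0, g, v, k_alpha, k_beta):
    """Average force strengths at separation r."""
    th = theta(r, R0, g, v, k_alpha, k_beta)
    s = math.sqrt(k_alpha / k_beta)
    x = x0 * math.cosh(th) + y0 * s * math.sinh(th)
    y = y0 * math.cosh(th) + (x0 / s) * math.sinh(th)
    return x, y

# Assumed example: forces close from 2000 to 500 range units.
g = lambda r: 1.0 - r / 2500.0               # attrition grows as range closes
v = lambda r: -(8.0 - r / 500.0)             # dr/dt < 0; speed varies with range
k_alpha, k_beta, x0, y0, R0 = 0.004, 0.002, 100.0, 50.0, 2000.0

x, y = strengths(500.0, x0, y0, R0, g, v, k_alpha, k_beta)
print("x(500) =", x, " y(500) =", y)
# The square law relation between the state variables should be preserved.
print("square-law residual:",
      k_beta * (x0**2 - x**2) - k_alpha * (y0**2 - y**2))
```

The residual printed in the last line vanishes to rounding error, since the hyperbolic form satisfies the square law relation identically, whatever velocity profile is assumed.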

d . The Condition for Solution in Terms of Elementary Functions . 
We discuss in this section necessary and sufficient conditions 
for a second order ordinary differential equation which Bonder has 
derived [20] to be transformed to a constant coefficient equation 
yielding exponential solutions. This covers all but one of the results 
obtained by Bonder [73]. 

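We start by considering

\[ \frac{dx}{dr} = -\frac{\alpha(r)}{v}\, y, \qquad \frac{dy}{dr} = -\frac{\beta(r)}{v}\, x, \tag{D6} \]

which is implicit in the development of (D1). By differentiation and
substitution, we may combine these equations into a single second order
equation for x,

\[ \frac{d^2 x}{dr^2} - \frac{v}{\alpha(r)} \frac{dx}{dr} \frac{d}{dr}\!\left[ \frac{\alpha(r)}{v} \right] + \frac{\alpha(r)}{v} \frac{dy}{dr} = 0 \]

or

\[ \frac{d^2 x}{dr^2} - \frac{v}{\alpha(r)} \frac{d}{dr}\!\left[ \frac{\alpha(r)}{v} \right] \frac{dx}{dr} - \frac{\alpha(r) \beta(r)}{v^2}\, x = 0, \]

which for v = constant (i.e., constant relative velocity of force
movement) becomes

\[ \frac{d^2 x}{dr^2} - \frac{1}{\alpha} \frac{d\alpha}{dr} \frac{dx}{dr} - \frac{\alpha \beta}{v^2}\, x = 0. \tag{D7} \]

A similar equation is obtained for y.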

In [40] p. 50 it is stated that a necessary and sufficient condition
to be able to transform the equation

\[ \frac{d^2 y}{dx^2} + a_1(x) \frac{dy}{dx} + a_2(x)\, y = 0 \]

into an equation with constant coefficients is that

\[ \frac{a_2' + 2 a_1 a_2}{a_2^{3/2}} = \text{constant}. \]

The desired substitution is given by
$Z = f(x) = \frac{1}{\sqrt{A}} \int^{x} \sqrt{a_2(x)}\, dx$ (where
A is defined on p. 50 of [40]). This reference also gives the
transformed second order equation in the new independent variable Z.
When the above theorem is applied to (D7), we find out that (D7) can be
transformed to an equation with constant coefficients if



\[ \frac{1}{\beta} \frac{d\beta}{dr} = \frac{1}{\alpha} \frac{d\alpha}{dr}, \]

which is easily seen to be equivalent to

\[ \frac{d}{dr}\!\left[ \frac{\alpha(r)}{\beta(r)} \right] = 0, \]

or $\alpha(r)/\beta(r)$ = constant. It is not surprising in view of our
previous development that $\alpha(r)/\beta(r)$ equal to a constant is a
sufficient condition for equation (D7) to be transformed into an
equation with constant coefficients. The development of necessary
conditions in the general case is more complicated.
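The sufficient condition is simple to test for a given pair of attrition-rate functions, since (1/alpha) d(alpha)/dr - (1/beta) d(beta)/dr is the derivative of ln(alpha/beta). The sketch below estimates this quantity by a central difference. The linear range dependence with maximum effective ranges `R_alpha` and `R_beta` is a hypothetical form assumed here for illustration, as are the function name and the numbers.

```python
import math

def log_ratio_slope(alpha, beta, r, eps=1e-4):
    """Central-difference estimate of (1/alpha) dalpha/dr - (1/beta) dbeta/dr."""
    f = lambda u: math.log(alpha(u)) - math.log(beta(u))
    return (f(r + eps) - f(r - eps)) / (2.0 * eps)

def linear_rate(k, R):
    """Hypothetical attrition rate k*(1 - r/R), valid for r < R."""
    return lambda r: k * (1.0 - r / R)

# Equal maximum ranges: alpha/beta is constant and the condition holds.
same = [log_ratio_slope(linear_rate(0.04, 2000.0), linear_rate(0.02, 2000.0), r)
        for r in (500.0, 1000.0, 1500.0)]

# Unequal maximum ranges (R_alpha = 2000, R_beta = 3000): the condition fails.
diff = [log_ratio_slope(linear_rate(0.04, 2000.0), linear_rate(0.02, 3000.0), r)
        for r in (500.0, 1000.0, 1500.0)]

print("equal ranges:  ", same)
print("unequal ranges:", diff)
```

For the unequal-range case the quantity equals 1/(R_beta - r) - 1/(R_alpha - r), which is nonzero, so the sufficient condition above is not met.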

The above theorem from [40] explains why equation (10) of [73]
has not yielded to solution when $R_\alpha \ne R_\beta$. In this case
it is seen to be impossible to transform the equation into one yielding
exponential solutions. Our work here then confirms the conjecture made
in [73] that the condition which facilitated the results obtained at the
University of Michigan was that $\alpha(r)/\beta(r)$ = constant.

We also note that the transformations employed by Bonder [20]
are readily discovered by p. 50 of [40], but we omit the details. We
have also briefly tried to solve equation (10) of [73] for
$R_\alpha \ne R_\beta$ by classical ordinary differential equation
methods (see [45] or pp. 530-576 of [65]). It appears that this equation
is not a standard form and series methods must be used. Time has
permitted only a very cursory look at this.

APPENDIX E. Connection with Bellman's Stochastic Gold-Mining Problem.

In this appendix we solve several versions of a continuous stochastic 
decision process by means of the Pontryagin maximum principle. The basic 
problem has been called the continuous version of a stochastic gold- 
mining process (see pp. 227-233 of [9]), but it is really an idealiza- 
tion of an allocation problem for strategic bombers. We consider a 
decision being made sequentially and continuously over a period of time 
with the result of the decision not certain. We assume that we know 
the probabilities associated with each outcome. This type of problem 
is referred to in the economics literature as decision making under risk. 

This is the continuous version of a stochastic decision process. 

A discrete version has been formulated and solved (see pp. 61-79 of [9]). 
However, the continuous problem permits certain relationships between 
model parameters and the structure of the optimal allocation policies 
to be explicitly exhibited. This is not possible to the degree developed 
here for a dynamic programming numerical solution procedure. The type 
of idealization which leads to a simple analytical solution frequently 
provides insight into the fundamental structure of the optimal allocation 
policies . 

We consider a sequence of models. Two basic cases are allocation 
in the face of diminishing returns and non-diminishing returns. Two 
further subcases for each of these are prescribed duration use of a 
resource and also maximum return for specified risk. Thus we actually 
consider four models. There is a close relation between these models 
and their optimal allocation policies and the allocation problems in
combat described by Lanchester-type equations of warfare which we
considered in Appendix C. This has been our motivation for the current
development.

First we give some background on the basic problem and then we
develop the solution to each of the four problems. Then we summarize
the solutions and discuss the significance of this work.

a. Background.

R. Bellman and R. S. Lehman did the original work on the "continuous
gold-mining equation." The problem is actually to maximize the expected 
damage by a bomber by the proper choice of the bombing sequence of two 
target areas. The bomber, of course, is subject to being shot down. 

The problem was originally solved by Bellman and Lehman by use of varia- 
tional methods (the case of diminishing returns only) . In this solution 
process, they make use of knowledge of the solution to the discrete 
version of this problem. A significant point to note is that this 
problem (for the case of diminishing returns) has a singular solution 
(see [53]). This appears to be the first example in the literature of 
a problem with a singular control. It was correctly solved ten years 
before the first publication on singular control problems appeared [54]. 

We shall use the newer theory to solve it. The current approach provides 
more insight and also leads to a new interpretation of these problems. 

The case of non-diminishing returns was not previously solved (it is 
the less complex case) . 

The current treatment of these problems by the Pontryagin maximum 
principle provides further insight. We see that the problem referred 
to by Bellman as the infinite duration problem is actually the problem
of maximizing return for a specified risk. It is not essential that
the problem last for an infinite length of time. 

We consider the case of non-diminishing returns to contrast its 
solution with that of diminishing returns. As we have noted previously, 
there is a close parallel between the solutions of these problems and 
the solutions to the fire programming problems considered in Appendix C. 
We may think of a square law attrition process as the case of non-dimin- 
ishing returns per unit of weapon system, whereas a linear law attrition 
process corresponds to diminishing returns per unit of weapon system. 

It appears worthwhile to further study the structure of such allocation 
problems and to further interpret the various structures of the optimal 
allocation policies. It also seems worthwhile to consider the inter- 
relationships between such problems in the literature, but time has not 
permitted this. 

The problem is to maximize the expected return for the use of a 
resource subject to loss (destruction or breakdown) by choice of the 
operating sequence in two deployment areas. The original motivation 
for this problem was the allocation of a bomber to strategic targets. 
Imagine that we had a bomber that we could send to either target A or 
target B. There is a return (fraction of strategic value destroyed) 
and a risk (probability of bomber being shot down) for each target area. 
The problem is to determine the tradeoff between risk and return. The 
reader is directed to pages 227-228 of [9] for the derivation of the 
models we consider in the next section. 

b . Development of Solution to Problems . 

In this section we present the development of the solution to four
versions of the continuous gold-mining problem. We consider the
following problems:

(a) non-diminishing returns - prescribed duration use, 

(b) non-diminishing returns - maximum return for specified risk, 

(c) diminishing returns - prescribed duration use, 

(d) diminishing returns - maximum return for specified risk. 

1. Non-diminishing Returns - Prescribed Duration Use . 

We consider

\[ \max_{\phi(t)} \int_0^T p(t) \{ \phi r_1 + (1 - \phi) r_2 \}\, dt \quad \text{with } T \text{ specified} \]



subject to: 



dx 

dF ■ 



^ + (1 - 4^)q2K 

\[ x, y, p \ge 0 \quad \text{and} \quad 0 \le \phi \le 1, \]

with initial conditions

\[ x(t = 0) = x_0, \quad y(t = 0) = y_0, \quad p(t = 0) = 1, \]

where

x, y are strategic values of target areas 1 and 2, respectively,
at time t,

p is probability that bomber survives until time t,

$r_1$, $r_2$ are rates at which strategic value is destroyed,

$q_1$, $q_2$ are rates at which bomber is shot down.

In the present analysis we assume that neither x nor y ever becomes
zero.

The Hamiltonian, $H(t,x,p,\phi)$, is given by



H(t,x,p,4>) = p(t){4)r^+(l-4>)r2}- ~ 



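The adjoint equations are given by

\[ \dot{p}_1 = -\frac{\partial H}{\partial x} = 0 \;\Rightarrow\; p_1(t) = \text{const}, \]

\[ \dot{p}_2 = -\frac{\partial H}{\partial y} = 0 \;\Rightarrow\; p_2(t) = \text{const}, \]

\[ \dot{p}_3 = -\frac{\partial H}{\partial p} = -\phi r_1 - (1 - \phi) r_2 + p_3 \{ \phi q_1 + (1 - \phi) q_2 \}, \]

so that

\[ p_1(t) = 0 \quad \text{since } p_1(t = T) = 0, \]

\[ p_2(t) = 0 \quad \text{since } p_2(t = T) = 0, \]

\[ \frac{dp_3}{dt} = \phi \{ -r_1 + p_3 q_1 \} + (1 - \phi) \{ -r_2 + p_3 q_2 \} \quad \text{with } p_3(t = T) = 0. \tag{E2} \]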



Combining (E1) and (E2), we see that the Hamiltonian becomes

$$H(t,x,p,\phi) = p(t)\{\phi r_1 + (1-\phi) r_2\} - p_3 p\{\phi q_1 + (1-\phi) q_2\}. \tag{E3}$$

The optimal control (there is only one extremal) is determined from
$\max_\phi H$, which is the same as
$\max_\phi \{\phi[r_1 - p_3 q_1] + (1-\phi)[r_2 - p_3 q_2]\}$,
since $p(t) \ge 0$. Hence, the optimal control is given by

$$\text{for } q_2 > q_1: \quad \phi(t) = \begin{cases} 1 & \text{for } p_3(t) > \dfrac{r_2 - r_1}{q_2 - q_1} \\[2ex] 0 & \text{for } p_3(t) < \dfrac{r_2 - r_1}{q_2 - q_1} \end{cases} \tag{E4}$$

and

$$\text{for } q_2 < q_1: \quad \phi(t) = \begin{cases} 1 & \text{for } p_3(t) < \dfrac{r_2 - r_1}{q_2 - q_1} \\[2ex] 0 & \text{for } p_3(t) > \dfrac{r_2 - r_1}{q_2 - q_1}. \end{cases} \tag{E5}$$

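We check to see if there is a singular solution [53] to this problem.
A more detailed discussion of singular solutions is to be found
in Appendix C. A singular extremal is determined by the conditions [54]

$$\frac{\partial H}{\partial \phi} = 0 \quad \text{and} \quad \frac{d}{dt}\left(\frac{\partial H}{\partial \phi}\right) = 0.$$

Using (E3) for the problem at hand, we obtain

$$p\{(r_1 - r_2) - p_3 (q_1 - q_2)\} = 0$$

and

$$\frac{dp}{dt}\{(r_1 - r_2) - p_3 (q_1 - q_2)\} - p\,(q_1 - q_2)\frac{dp_3}{dt} = 0,$$

which imply (ignoring pathological cases)

$$\frac{dp_3}{dt} = 0 = \phi\{-r_1 + p_3 q_1\} + (1-\phi)\{-r_2 + p_3 q_2\}$$

together with $p_3 = (r_2 - r_1)/(q_2 - q_1)$. Substituting this value
of $p_3$, both braces reduce to $(r_2 q_1 - r_1 q_2)/(q_2 - q_1)$, so
that the condition cannot be met by any choice of $\phi$ unless
$r_2 q_1 = r_1 q_2$. Thus, we see that unless $r_1/q_1 = r_2/q_2$,
an unlikely case, there is no singular solution.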

We develop the solution by working backwards from the end of the
problem at t = T. It suffices to consider the case where $q_2 > q_1$.
There are two further cases to consider depending on whether
$r_2 > r_1$ or $r_1 > r_2$.

Case (a) $r_2 > r_1$ and $q_2 > q_1$

In this case we have $\dfrac{r_2 - r_1}{q_2 - q_1} > 0$ with $q_2 > q_1$.
Recalling that $p_3(t=T) = 0$ and using (E4), we see that $\phi(t=T) = 0$.
We introduce the backwards time $\tau = T - t$ so that the adjoint equation
(E2) becomes

$$\frac{dp_3}{d\tau} = \phi\,(r_1 - p_3 q_1) + (1-\phi)(r_2 - p_3 q_2).$$

Thus, up until the time of the first switch in tactics, which we denote
by $\tau_1$, we have

$$\frac{dp_3}{d\tau} = r_2 - p_3 q_2 \quad \text{with} \quad p_3(\tau = 0) = 0.$$

Integration of the above yields

$$p_3(\tau) = \frac{r_2}{q_2}\left(1 - e^{-q_2 \tau}\right). \tag{E6}$$



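If $p_3(\tau) < \dfrac{r_2 - r_1}{q_2 - q_1}$ for all $\tau \ge 0$, then
we can never switch to $\phi(\tau) = 1$. The above readily yields that we
never switch from $\phi(t) = 0$ when $\dfrac{r_2}{q_2} > \dfrac{r_1}{q_1}$.
There can be a switch in tactics to $\phi = 1$ when
$\dfrac{r_2}{q_2} < \dfrac{r_1}{q_1}$, however. The time of this switch,
$\tau_1$, is determined from

$$p_3(\tau_1) = \frac{r_2}{q_2}\left(1 - e^{-q_2 \tau_1}\right) = \frac{r_2 - r_1}{q_2 - q_1}. \tag{E7}$$

From (E7) the time of switch is readily computed to be

$$\tau_1 = \frac{1}{q_2} \ln\left\{\frac{r_2 (q_2 - q_1)}{r_1 q_2 - r_2 q_1}\right\}. \tag{E8}$$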
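As a numerical illustration of (E6) through (E8), the following minimal sketch evaluates the switch time and confirms that the dual variable reaches the switching threshold of (E4) at that instant. The rate values and the function names are assumptions chosen only for illustration; they do not come from the original analysis.

```python
import math

def switch_time(r1, r2, q1, q2):
    """Backwards-time switch tau_1 from (E8); None when no switch exists."""
    if not (q2 > q1 and r2 > r1):
        raise ValueError("Case (a) requires q2 > q1 and r2 > r1")
    if r2 / q2 >= r1 / q1:
        return None  # p3 never reaches the threshold, so phi stays at 0
    return math.log(r2 * (q2 - q1) / (r1 * q2 - r2 * q1)) / q2

def p3_backwards(tau, r2, q2):
    """Dual variable along phi = 0 in backwards time, from (E6)."""
    return (r2 / q2) * (1.0 - math.exp(-q2 * tau))

# Illustrative (assumed) rates with r2/q2 < r1/q1, so a switch exists.
r1, r2, q1, q2 = 1.0, 2.0, 1.0, 4.0
tau1 = switch_time(r1, r2, q1, q2)
threshold = (r2 - r1) / (q2 - q1)   # switching level of p3 in (E4)
print(tau1, p3_backwards(tau1, r2, q2), threshold)
```

For these assumed rates the two printed values of the dual variable agree, which is simply (E7) restated numerically.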



For this potential switch to actually occur, the planning horizon, T,
must be of sufficient length. The condition is that $T - \tau_1 \ge 0$,
which implies that for the switch to occur the planning horizon length
must satisfy

$$T \ge \frac{1}{q_2} \ln\left\{\frac{r_2 (q_2 - q_1)}{r_1 q_2 - r_2 q_1}\right\}. \tag{E9}$$

Assuming that T satisfies (E9), then for $\dfrac{r_2}{q_2} < \dfrac{r_1}{q_1}$
we have

$$\phi(t) = \begin{cases} 1 & \text{for } 0 \le t \le T - \tau_1, \\ 0 & \text{for } T - \tau_1 \le t \le T. \end{cases} \tag{E10}$$
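The single-switch policy (E10) can be cross-checked without the maximum principle. For a policy that uses $\phi = 1$ on $[0, s]$ and $\phi = 0$ on $[s, T]$, the state equations integrate in closed form, and the return can be searched directly over the switch instant $s$. The sketch below does this on a grid; the rates, the horizon, and the helper name `campaign_return` are illustrative assumptions, and the check covers only this one-switch family of policies.

```python
import math

def campaign_return(s, T, r1, r2, q1, q2):
    """Expected return when phi = 1 on [0, s] and phi = 0 on [s, T]."""
    # On [0, s]: p(t) = exp(-q1 t) and the return rate is r1 * p(t).
    first = (r1 / q1) * (1.0 - math.exp(-q1 * s))
    # On [s, T]: p(t) = exp(-q1 s) * exp(-q2 (t - s)), return rate r2 * p(t).
    second = math.exp(-q1 * s) * (r2 / q2) * (1.0 - math.exp(-q2 * (T - s)))
    return first + second

# Illustrative (assumed) rates and horizon.
r1, r2, q1, q2, T = 1.0, 2.0, 1.0, 4.0, 2.0
tau1 = math.log(r2 * (q2 - q1) / (r1 * q2 - r2 * q1)) / q2   # (E8)

n = 20000
grid = [T * k / n for k in range(n + 1)]
best_s = max(grid, key=lambda s: campaign_return(s, T, r1, r2, q1, q2))
print(best_s, T - tau1)
```

The grid maximizer should fall within one grid step of $T - \tau_1$, in agreement with (E10).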



Case (b) $r_2 < r_1$ and $q_2 > q_1$

In this case we have $\dfrac{r_2 - r_1}{q_2 - q_1} < 0$ with $q_2 > q_1$.
Recalling that $p_3(t=T) = 0$ and using (E4), we see that $\phi(t=T) = 1$.
We introduce the backwards time $\tau = T - t$. The adjoint equation (E2)
for the dual variable becomes

$$\frac{dp_3}{d\tau} = \phi\{r_1 - p_3 q_1\} + (1-\phi)\{r_2 - p_3 q_2\}.$$

Thus, up until the time of the first switch in tactics, which we denote
by $\tau_1$, we have

$$\frac{dp_3}{d\tau} = r_1 - p_3 q_1 \quad \text{with} \quad p_3(\tau = 0) = 0.$$

Integration of the above readily yields

$$p_3(\tau) = \frac{r_1}{q_1}\left(1 - e^{-q_1 \tau}\right).$$

If $p_3(\tau) > \dfrac{r_2 - r_1}{q_2 - q_1}$ for all $\tau \ge 0$, then we
can never switch to $\phi(\tau) = 0$. The above readily yields that we
never switch from $\phi(t) = 1$ when $\dfrac{r_1}{q_1} > \dfrac{r_2}{q_2}$,
but this follows from the conditions which define this case. Hence, there
is never a switch in tactics and we have

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E11}$$



2. Non-diminishing Returns - Maximum Return for Specified 
Risk . 

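We consider

$$\max_{\phi(t)} \int_0^T p(t)\{\phi r_1 + (1-\phi) r_2\}\,dt \quad \text{with } T \text{ unspecified,}$$

subject to:

$$\frac{dx}{dt} = -\phi r_1, \qquad \frac{dy}{dt} = -(1-\phi) r_2, \qquad \frac{dp}{dt} = -p\{\phi q_1 + (1-\phi) q_2\},$$

$$x, y, p \ge 0 \quad \text{and} \quad 0 \le \phi \le 1,$$

with initial conditions

$$x(t=0) = x_0, \quad y(t=0) = y_0, \quad p(t=0) = 1,$$

and terminal condition

$$p(t=T) = \varepsilon > 0 \quad (\text{also } \varepsilon < 1).$$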



As before, we assume that neither x nor y ever becomes zero.

As before, the Hamiltonian is given by (E1), but now the adjoint
equations have the boundary condition on $p_3(t=T)$ unspecified. Thus

$$p_1(t) = \text{const} = 0,$$

$$p_2(t) = \text{const} = 0,$$

$$\frac{dp_3}{dt} = \phi\{-r_1 + p_3 q_1\} + (1-\phi)\{-r_2 + p_3 q_2\} \quad \text{with } p_3(t=T) \text{ unspecified.} \tag{E12}$$

Since the termination time T is unspecified, we have the following
transversality condition (using (E3))

$$H(t,x,p,\phi) = 0 = p(t)\{\phi r_1 + (1-\phi) r_2\} - p_3 p\{\phi q_1 + (1-\phi) q_2\}. \tag{E13}$$

The optimal control is again given by (E4) and (E5). Again, it is 
impossible to have a singular solution to this problem. 

We develop the solution by working backwards from the end of the
problem at t = T. By the symmetry of the problem, it suffices to
consider the case where $q_2 > q_1$. There are two further cases to
consider depending on whether $r_2 > r_1$ or $r_2 < r_1$.

Case (a) $r_2 > r_1$ and $q_2 > q_1$

In this case (E13) and $p(t=T) = \varepsilon > 0$ yield

$$\phi\,[-(r_2 - r_1) + p_3 (q_2 - q_1)] + r_2 - p_3 q_2 = 0. \tag{E14}$$

From the definition of this case, we have
$\dfrac{r_2 - r_1}{q_2 - q_1} > 0$ with $q_2 > q_1$.
It is easy to show that we must have $p_3(t) > 0$. We prove this by
contradiction. Assume that we had $p_3(t) \le 0$. Then we would have
$p_3(t) \le 0 < \dfrac{r_2 - r_1}{q_2 - q_1}$ so that by (E4) we obtain
$\phi(t) = 0$. Substituting this in (E14) we obtain

$$p_3(t) = \frac{r_2}{q_2} > 0,$$

which contradicts our assumption. In particular, we must have
$p_3(t=T) > 0$. There are two subcases to consider.



Subcase (1) $p_3(t=T) > \dfrac{r_2 - r_1}{q_2 - q_1}$

By (E4) we have $\phi(t=T) = 1$. We combine this with the
transversality condition (E14) to obtain

$$p_3(t=T) = \frac{r_1}{q_1} > 0. \tag{E15}$$

This in turn generates further conditions as follows

$$\frac{r_1}{q_1} = p_3(t=T) > \frac{r_2 - r_1}{q_2 - q_1} \;\Rightarrow\; \frac{r_1}{q_1} > \frac{r_2}{q_2},$$

which is easily verified to be consistent with Case (a). Using the
obtained control and backwards time $\tau = T - t$, we have up until the
time of the first switch in tactics, $\tau_1$, from (E2)

$$\frac{dp_3}{d\tau} = r_1 - p_3 q_1 \quad \text{with} \quad p_3(\tau = 0) = \frac{r_1}{q_1}.$$

Integration of the above readily yields

$$p_3(\tau) = \frac{r_1}{q_1} = \text{const}.$$

Thus, we have for $\dfrac{r_1}{q_1} > \dfrac{r_2}{q_2}$

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E16}$$



Subcase (2) $p_3(t=T) < \dfrac{r_2 - r_1}{q_2 - q_1}$

By (E4) we have $\phi(t=T) = 0$. We combine this with the
transversality condition (E14) to obtain

$$p_3(t=T) = \frac{r_2}{q_2} > 0. \tag{E17}$$

This in turn generates further conditions as follows

$$\frac{r_2}{q_2} = p_3(t=T) < \frac{r_2 - r_1}{q_2 - q_1} \;\Rightarrow\; \frac{r_1}{q_1} < \frac{r_2}{q_2},$$

which is easily verified to be consistent with Case (a). Using the
obtained control and backwards time $\tau = T - t$, we have up until the
time of the first switch in tactics, $\tau_1$, from (E2)

$$\frac{dp_3}{d\tau} = r_2 - p_3 q_2 \quad \text{with} \quad p_3(\tau = 0) = \frac{r_2}{q_2}.$$

Integration of the above readily yields

$$p_3(\tau) = \frac{r_2}{q_2} = \text{const}.$$

Thus, we have for $\dfrac{r_2}{q_2} > \dfrac{r_1}{q_1}$

$$\phi(t) = 0 \quad \text{for } 0 \le t \le T. \tag{E18}$$



Case (b) $r_2 < r_1$ and $q_2 > q_1$

From the definition of this case, we have
$\dfrac{r_2 - r_1}{q_2 - q_1} < 0$ with $q_2 > q_1$. It is easy to show
that we must have $p_3(t) > \dfrac{r_2 - r_1}{q_2 - q_1}$. We prove this
by contradiction. Assume that we had
$p_3(t) \le \dfrac{r_2 - r_1}{q_2 - q_1}$. Then by (E4) we would have
$\phi(t) = 0$ so that (E14) would yield

$$p_3(t) = \frac{r_2}{q_2} > 0,$$

which contradicts our assumption. In particular, we must have
$p_3(t=T) > \dfrac{r_2 - r_1}{q_2 - q_1}$ and hence $\phi(t=T) = 1$ by
(E4). From (E14) we obtain

$$p_3(t=T) = \frac{r_1}{q_1} > 0.$$

This in turn generates a further condition as follows

$$\frac{r_1}{q_1} = p_3(t=T) > \frac{r_2 - r_1}{q_2 - q_1} \;\Rightarrow\; \frac{r_1}{q_1} > \frac{r_2}{q_2},$$

which is easily verified to be consistent with Case (b). It is
recognized that this case has turned out to be identical with Subcase (1)
of Case (a). Thus, we have for $\dfrac{r_1}{q_1} > \dfrac{r_2}{q_2}$,

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E19}$$
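The conclusion of (E16), (E18), and (E19), namely that the whole effort goes to the target area with the larger ratio $r_i/q_i$, can be illustrated with constant allocations. For a constant $\phi$ flown until $p(T) = \varepsilon$, the state equations give a return of $\{(\phi r_1 + (1-\phi) r_2)/(\phi q_1 + (1-\phi) q_2)\}(1 - \varepsilon)$. The sketch below tabulates this over a few values of $\phi$; the rates, the risk level, and the function name are assumptions made for illustration, and only constant allocations are compared.

```python
def return_for_risk(phi, eps, r1, r2, q1, q2):
    """Return of a constant allocation phi flown until p(T) = eps."""
    gain = phi * r1 + (1.0 - phi) * r2   # return rate per unit of p
    loss = phi * q1 + (1.0 - phi) * q2   # rate at which p decays
    return (gain / loss) * (1.0 - eps)

# Illustrative (assumed) values with r1/q1 > r2/q2.
r1, r2, q1, q2, eps = 1.0, 2.0, 1.0, 4.0, 0.2
values = [(k / 10, return_for_risk(k / 10, eps, r1, r2, q1, q2))
          for k in range(11)]
best_phi, best_value = max(values, key=lambda pair: pair[1])
print(best_phi, best_value)
```

Because the return is a ratio of two linear functions of $\phi$, it is monotone in $\phi$, so the best constant allocation sits at an end point, here $\phi = 1$.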



3. Diminishing Returns - Maximum Return for Specified Risk . 



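We consider

$$\max_{\phi(t)} \int_0^T p(t)\{\phi r_1 x + (1-\phi) r_2 y\}\,dt \quad \text{with } T \text{ unspecified,}$$

subject to:

$$\frac{dx}{dt} = -\phi r_1 x, \qquad \frac{dy}{dt} = -(1-\phi) r_2 y, \qquad \frac{dp}{dt} = -p\{\phi q_1 + (1-\phi) q_2\},$$

$$x, y, p \ge 0 \quad \text{and} \quad 0 \le \phi \le 1,$$

with initial conditions

$$x(t=0) = x_0, \quad y(t=0) = y_0, \quad p(t=0) = 1,$$

and terminal condition

$$p(t=T) = \varepsilon > 0 \quad (\text{also } \varepsilon < 1).$$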

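The Hamiltonian, $H(t,x,p,\phi)$, is given by

$$H(t,x,p,\phi) = \phi\,[p\{r_1 x - r_2 y\} - p_1 r_1 x + p_2 r_2 y - p_3 p (q_1 - q_2)] + p r_2 y - p_2 r_2 y - p_3 p q_2, \tag{E20}$$

and the optimal control (there is only one extremal) is determined from
$\max_\phi H(t,x,p,\phi)$ or

$$\max_\phi\,[\phi\{p r_1 x - p_1 r_1 x - p_3 p q_1\} + (1-\phi)\{p r_2 y - p_2 r_2 y - p_3 p q_2\}],$$

which yields the non-singular optimal control to be given by

$$\phi(t) = \begin{cases} 1 & \text{for } p r_1 x - p_1 r_1 x - p_3 p q_1 > p r_2 y - p_2 r_2 y - p_3 p q_2 \\ 0 & \text{for } p r_1 x - p_1 r_1 x - p_3 p q_1 < p r_2 y - p_2 r_2 y - p_3 p q_2. \end{cases} \tag{E21}$$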



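From (E20) the adjoint equations for the dual variables are seen to be

$$\frac{dp_1}{dt} = -\frac{\partial H}{\partial x} = -\phi r_1 (p - p_1) \quad \text{with} \quad p_1(t=T) = 0,$$

$$\frac{dp_2}{dt} = -\frac{\partial H}{\partial y} = -(1-\phi) r_2 (p - p_2) \quad \text{with} \quad p_2(t=T) = 0,$$

$$\frac{dp_3}{dt} = -\frac{\partial H}{\partial p} = -\{\phi r_1 x + (1-\phi) r_2 y\} + p_3\{\phi q_1 + (1-\phi) q_2\} \quad \text{with } p_3(t=T) \text{ unspecified.} \tag{E22}$$

Since the Hamiltonian is a linear function of the control variable
$\phi$, the maximum principle does not determine the control when the
coefficient of $\phi$ vanishes for a finite interval of time (see p. 481
of [6]). The part of a trajectory for which this happens is called a
singular subarc. We determine the conditions for a singular subarc
from [54]

$$\frac{\partial H}{\partial \phi} = 0 \quad \text{and} \quad \frac{d}{dt}\left(\frac{\partial H}{\partial \phi}\right) = 0. \tag{E23}$$

We should also note that since the terminal time is unspecified, we
have from a transversality condition

$$H(t,x,p,\phi) = 0. \tag{E24}$$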



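We have from (E20) that

$$\frac{\partial H}{\partial \phi} = p\{r_1 x - r_2 y\} - p_1 r_1 x + p_2 r_2 y - p_3 p (q_1 - q_2). \tag{E25}$$

A rather lengthy computation, which makes use of both the adjoint
equations (E22) and the state equations, yields

$$\frac{d}{dt}\left(\frac{\partial H}{\partial \phi}\right) = p\,(-q_2 r_1 x + q_1 r_2 y). \tag{E26}$$

By (E23) and (E26), we see that a condition for a singular subarc is
that

$$\frac{r_1 x}{q_1} = \frac{r_2 y}{q_2}. \tag{E27}$$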
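The computation leading to (E26) can be spot-checked numerically. The sketch below differentiates $\partial H/\partial\phi$ by the chain rule along the state equations and along adjoint equations obtained from (E20) as $\dot{p}_1 = -\partial H/\partial x$, $\dot{p}_2 = -\partial H/\partial y$, $\dot{p}_3 = -\partial H/\partial p$ (this explicit adjoint form is our assumption about (E22)), and compares the result with $p\,(q_1 r_2 y - q_2 r_1 x)$ at randomly drawn points. The function name and the sampling ranges are illustrative.

```python
import random

def switching_rate(x, y, p, p1, p2, p3, phi, r1, r2, q1, q2):
    """d/dt of dH/dphi via the chain rule along state and adjoint flows."""
    Q = phi * q1 + (1.0 - phi) * q2
    R = phi * r1 * x + (1.0 - phi) * r2 * y
    # State equations of the diminishing-returns problem.
    dx = -phi * r1 * x
    dy = -(1.0 - phi) * r2 * y
    dp = -p * Q
    # Adjoint equations derived from (E20) (assumed form of (E22)).
    dp1 = -phi * r1 * (p - p1)
    dp2 = -(1.0 - phi) * r2 * (p - p2)
    dp3 = -R + p3 * Q
    # Sum of (partial derivative of dH/dphi) times (rate of each variable).
    return (r1 * (p - p1) * dx - r2 * (p - p2) * dy
            + (r1 * x - r2 * y - p3 * (q1 - q2)) * dp
            - r1 * x * dp1 + r2 * y * dp2 - p * (q1 - q2) * dp3)

random.seed(0)
worst = 0.0
for _ in range(1000):
    x, y, p, p1, p2, p3, phi, r1, r2, q1, q2 = (random.random() for _ in range(11))
    lhs = switching_rate(x, y, p, p1, p2, p3, phi, r1, r2, q1, q2)
    rhs = p * (q1 * r2 * y - q2 * r1 * x)   # right-hand side of (E26)
    worst = max(worst, abs(lhs - rhs))
print(worst)
```

The largest discrepancy is at the level of floating-point round-off, and the result does not depend on $\phi$ or on the values of the dual variables, which is why the singular condition (E27) involves the state variables only.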



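The singular control is determined from requiring that it keep us on
the singular subarc. Thus, (E23) and (E26) yield (note that
$q_1 q_2 \ne 0$ and $p \ne 0$)

$$\frac{d}{dt}\left(\frac{r_1 x}{q_1} - \frac{r_2 y}{q_2}\right) = 0$$

or using the state equations,

$$\frac{r_1}{q_1}(-\phi r_1 x) = \frac{r_2}{q_2}\{-(1-\phi) r_2 y\}$$

or

$$\frac{r_1 x}{q_1}(\phi r_1) = \frac{r_2 y}{q_2}(1-\phi) r_2.$$

Using the fact that we are on a singular subarc so that (E27) holds,
we obtain the singular control as

$$\phi = \frac{r_2}{r_1 + r_2}. \tag{E28}$$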
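A short check of (E28): with $\phi = r_2/(r_1 + r_2)$ both target values decay at the common rate $r_1 r_2/(r_1 + r_2)$, so a trajectory that starts on the line (E27) stays on it. The sketch below integrates the state equations in closed form for a constant $\phi$ and evaluates the gap $r_1 x/q_1 - r_2 y/q_2$ at several times. The numerical values and the function name are illustrative assumptions.

```python
import math

def singular_path(x0, y0, r1, r2, t):
    """State at time t when phi = r2 / (r1 + r2) is used throughout."""
    phi = r2 / (r1 + r2)
    x = x0 * math.exp(-phi * r1 * t)           # dx/dt = -phi r1 x
    y = y0 * math.exp(-(1.0 - phi) * r2 * t)   # dy/dt = -(1 - phi) r2 y
    return x, y

# Illustrative (assumed) rates; y0 is chosen so the start lies on (E27).
r1, r2, q1, q2 = 1.0, 0.8, 0.5, 0.2
x0 = 1.0
y0 = (r1 * x0 / q1) * q2 / r2
gaps = []
for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    x, y = singular_path(x0, y0, r1, r2, t)
    gaps.append(r1 * x / q1 - r2 * y / q2)
print(gaps)
```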



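A necessary condition for the singular subarc to yield a maximum
return is that [57]

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} \ge 0. \tag{E29}$$

From (E26) we have that

$$\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right) = \frac{dp}{dt}(-q_2 r_1 x + q_1 r_2 y) + p\left(-q_2 r_1 \frac{dx}{dt} + q_1 r_2 \frac{dy}{dt}\right)$$

or, using the state equations,

$$\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right) = -p\{\phi q_1 + (1-\phi) q_2\}(-q_2 r_1 x + q_1 r_2 y) + p\,(r_1)^2 q_2 x\,\phi - p\,(r_2)^2 q_1 y\,(1-\phi),$$

and hence

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = p\,(-q_1 + q_2)(-q_2 r_1 x + q_1 r_2 y) + p\,(r_1)^2 q_2 x + p\,(r_2)^2 q_1 y.$$

On the singular subarc we must have (E27), so that the above reduces to

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = p\{(r_1)^2 q_2 x + (r_2)^2 q_1 y\} > 0, \tag{E30}$$

and the necessary condition is satisfied.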

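It is convenient to define (where $\tau$ is backwards time defined
by $\tau = T - t$)

$$A(\tau) = p r_1 x - p_1 r_1 x - p_3 p q_1,$$

and

$$B(\tau) = p r_2 y - p_2 r_2 y - p_3 p q_2. \tag{E31}$$

Then (E21) may be written as

$$\phi(\tau) = \begin{cases} 1 & \text{for } A(\tau) > B(\tau) \\ 0 & \text{for } A(\tau) < B(\tau), \end{cases} \tag{E32}$$

with the singular control

$$\phi(\tau) = \frac{r_2}{r_1 + r_2} \quad \text{for } A(\tau) = B(\tau). \tag{E33}$$

Also

$$\frac{dA}{d\tau} = \frac{d}{dt}(-p r_1 x + p_1 r_1 x + p_3 p q_1),$$

and a laborious computation, which makes use of both the adjoint
equations (E22) and the state equations, yields

$$\frac{dA}{d\tau} = p\,(1-\phi)\,q_1 q_2 \left(\frac{r_1 x}{q_1} - \frac{r_2 y}{q_2}\right). \tag{E34}$$

Similarly,

$$\frac{dB}{d\tau} = p\,\phi\,q_1 q_2 \left(\frac{r_2 y}{q_2} - \frac{r_1 x}{q_1}\right). \tag{E35}$$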



We develop the solution by working backwards from the end of the
problem at t = T. We start by determining the boundary condition on
$p_3$ at the end. There are two cases to be considered: either we are
on a singular subarc at t = T or we are not.

If we are on a singular subarc, then by the transversality condition
(E24) and the condition of a singular subarc
$\dfrac{\partial H}{\partial \phi} = 0$, we have

$$p r_2 y - p_2 r_2 y - p_3 p q_2 = 0,$$

which yields by use of the boundary conditions on (E22)

$$p_3(t=T) = \frac{r_2\,y(t=T)}{q_2}. \tag{E36}$$

We also note that on the singular subarc (E27) applies.

If we are not on a singular subarc, then there are two further
subcases: either $\phi(t=T) = 1$ or $\phi(t=T) = 0$. If $\phi(t=T) = 1$,
then (E20), the transversality condition (E24), and the boundary
conditions on (E22) yield

$$p_3(t=T) = \frac{r_1\,x(t=T)}{q_1}. \tag{E37}$$

Since $\phi(t=T) = 1$, then by (E21) and the fact that
$p_1(t=T) = p_2(t=T) = 0$ we have

$$p r_1 x - p_3 p q_1 > p r_2 y - p_3 p q_2,$$

and hence

$$\frac{r_1\,x(t=T)}{q_1} > \frac{r_2\,y(t=T)}{q_2}. \tag{E38}$$

A similar development shows that for $\phi(t=T) = 0$, we must have

$$\frac{r_1\,x(t=T)}{q_1} < \frac{r_2\,y(t=T)}{q_2}. \tag{E39}$$



We now trace the optimal trajectories backwards from the end. 

From the above, we have three cases to consider. 

Case (1) at t = T, $\dfrac{r_1 x}{q_1} > \dfrac{r_2 y}{q_2}$

In this case by (E38) we have $\phi(t=T) = 1$. From (E21) and
boundary conditions we have

$$A(\tau = 0) > B(\tau = 0).$$

Then up until the time of the first switch in tactics we have from
(E34) and (E35)

$$\frac{dA}{d\tau} = 0,$$

and

$$\frac{dB}{d\tau} = p\,q_1 q_2 \left(\frac{r_2 y}{q_2} - \frac{r_1 x}{q_1}\right) < 0,$$

and hence

$$A(\tau) = A(\tau = 0) > B(\tau = 0) > B(\tau).$$

Thus, we have

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E40}$$

Case (2) at t = T, $\dfrac{r_1 x}{q_1} < \dfrac{r_2 y}{q_2}$

A similar argument shows that

$$\phi(t) = 0 \quad \text{for } 0 \le t \le T. \tag{E41}$$

Case (3) at t = T, $\dfrac{r_1 x}{q_1} = \dfrac{r_2 y}{q_2}$

We see that this corresponds to when the system ends up on the
singular subarc at t = T. In this case
$\phi(t=T) = \dfrac{r_2}{r_1 + r_2}$, and we continue (in backwards
progression) to use the singular control $\phi(t) = r_2/(r_1 + r_2)$
(note that $\dfrac{dA}{d\tau} = \dfrac{dB}{d\tau} = 0$ when this is used
and that we had $A(\tau = 0) = B(\tau = 0)$) until $x(t) = x_0$ or
$y(t) = y_0$. This yields three further subcases.

Subcase (3A) $\dfrac{r_1 x_0}{q_1} = \dfrac{r_2 y_0}{q_2}$

We use $\phi(t) = r_2/(r_1 + r_2)$ from the beginning.



Subcase (3B) $\dfrac{r_1 x_0}{q_1} > \dfrac{r_2 y_0}{q_2}$

Define $t_1$ as $t$ such that $y(t_1 > 0) = y_0$. Then we use
$\phi(t) = 1$ for $0 \le t \le t_1$. This is consistent since
$A(\tau = T - t_1) = B(\tau = T - t_1)$. Then up until the time of the
next switch in tactics we have from (E34) and (E35)

$$\frac{dA}{d\tau} = 0,$$

and

$$\frac{dB}{d\tau} = p\,q_1 q_2 \left(\frac{r_2 y}{q_2} - \frac{r_1 x}{q_1}\right) < 0,$$

and hence

$$A(\tau) = A(\tau = T - t_1) = B(\tau = T - t_1) > B(\tau).$$

From (E32) we see that

$$\phi(\tau) = 1 \quad \text{for } T - t_1 \le \tau \le T. \tag{E42}$$

Subcase (3C) $\dfrac{r_1 x_0}{q_1} < \dfrac{r_2 y_0}{q_2}$

A similar argument as that for Subcase (3B) with the roles
of x and y interchanged readily shows that

$$\phi(\tau) = 0 \quad \text{for } T - t_1 \le \tau \le T. \tag{E43}$$

Note that in the above developments we have implicitly made use of the
non-negativity of the state variables.
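The policy assembled in Cases (1) through (3) amounts to a feedback rule on the state: attack the area with the larger value of $r_1 x/q_1$ or $r_2 y/q_2$, and use the singular control (E33) once the two are equal. The following minimal sketch simulates the state equations by Euler steps until $p$ falls to $\varepsilon$ and compares this rule with three constant allocations. All numerical values, the step size, the tolerance used to detect the line (E27), and the function names are assumptions made for illustration.

```python
def fly_until_risk(policy, eps, x0, y0, r1, r2, q1, q2, dt=1e-3):
    """Euler simulation of the state equations until p(t) falls to eps."""
    x, y, p, total = x0, y0, 1.0, 0.0
    while p > eps:
        phi = policy(x, y)
        total += p * (phi * r1 * x + (1.0 - phi) * r2 * y) * dt
        x -= phi * r1 * x * dt
        y -= (1.0 - phi) * r2 * y * dt
        p -= p * (phi * q1 + (1.0 - phi) * q2) * dt
    return total

# Illustrative (assumed) data: initially r2*y0/q2 > r1*x0/q1.
r1, r2, q1, q2 = 1.0, 0.8, 0.5, 0.2
x0, y0, eps = 1.0, 1.0, 0.3

def feedback(x, y, rel_tol=1e-2):
    """Rule (E32), with the singular control (E33) near the line (E27)."""
    gx, gy = r1 * x / q1, r2 * y / q2
    if abs(gx - gy) <= rel_tol * max(gx, gy):
        return r2 / (r1 + r2)
    return 1.0 if gx > gy else 0.0

results = {
    "feedback": fly_until_risk(feedback, eps, x0, y0, r1, r2, q1, q2),
    "phi=1": fly_until_risk(lambda x, y: 1.0, eps, x0, y0, r1, r2, q1, q2),
    "phi=0": fly_until_risk(lambda x, y: 0.0, eps, x0, y0, r1, r2, q1, q2),
    "phi=0.5": fly_until_risk(lambda x, y: 0.5, eps, x0, y0, r1, r2, q1, q2),
}
print(results)
```

For these assumed values the feedback rule starts with $\phi = 0$, joins the singular subarc, and returns more than any of the constant allocations, as Subcase (3C) suggests it should.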

4. Diminishing Returns - Prescribed Duration Use . 

We consider

$$\max_{\phi(t)} \int_0^T p(t)\{\phi r_1 x + (1-\phi) r_2 y\}\,dt \quad \text{with } T \text{ specified,}$$

subject to:

$$\frac{dx}{dt} = -\phi r_1 x, \qquad \frac{dy}{dt} = -(1-\phi) r_2 y, \qquad \frac{dp}{dt} = -p\{\phi q_1 + (1-\phi) q_2\},$$

$$x, y, p \ge 0 \quad \text{and} \quad 0 \le \phi \le 1,$$

with initial conditions

$$x(t=0) = x_0, \quad y(t=0) = y_0, \quad p(t=0) = 1.$$

The development of the solution to this problem is similar to
that of maximizing return for a specified risk. We have considered the
latter problem in Section b3. above. Two main differences between these
problems are that (1) the boundary conditions on the dual variables at
t = T are slightly different and (2) for the present problem the total
time is specified so that the transversality condition
$H(t=T,x,p,\phi) = 0$ no longer is applicable. In view of the
similarities, we shall frequently summarize results from the previous
problem which apply to this one. The interested reader can, of course,
refer to the previous problem for full details.

The Hamiltonian, $H(t,x,p,\phi)$, is given by

$$H(t,x,p,\phi) = \phi\,[p\{r_1 x - r_2 y\} - p_1 r_1 x + p_2 r_2 y - p_3 p (q_1 - q_2)] + p r_2 y - p_2 r_2 y - p_3 p q_2. \tag{E44}$$

The adjoint equations for the dual variables are the same as (E22) with
the exception that the boundary conditions at t = T are now

$$p_1(t=T) = 0, \quad p_2(t=T) = 0, \quad p_3(t=T) = 0. \tag{E45}$$

The non-singular control obtained by maximizing the Hamiltonian is given
by (where, as before, $\tau$ is the backwards time defined by
$\tau = T - t$)

$$\phi(\tau) = \begin{cases} 1 & \text{for } A(\tau) > B(\tau) \\ 0 & \text{for } A(\tau) < B(\tau), \end{cases} \tag{E46}$$

where

$$A(\tau) = p r_1 x - p_1 r_1 x - p_3 p q_1,$$

$$B(\tau) = p r_2 y - p_2 r_2 y - p_3 p q_2. \tag{E47}$$



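As above, it may also be shown that

$$\frac{dA}{d\tau} = p\,(1-\phi)\,q_1 q_2 \left(\frac{r_1 x}{q_1} - \frac{r_2 y}{q_2}\right)$$

and

$$\frac{dB}{d\tau} = p\,\phi\,q_1 q_2 \left(\frac{r_2 y}{q_2} - \frac{r_1 x}{q_1}\right). \tag{E48}$$

It is convenient for a later development to define

$$D(\tau) = A(\tau) - B(\tau), \tag{E49}$$

so that (E46) becomes

$$\phi(\tau) = \begin{cases} 1 & \text{for } D(\tau) > 0 \\ 0 & \text{for } D(\tau) < 0. \end{cases} \tag{E50}$$

Using (E48) and (E49) we readily obtain

$$\frac{dD}{d\tau} = p\,q_1 q_2 \left(\frac{r_1 x}{q_1} - \frac{r_2 y}{q_2}\right) \tag{E51}$$

with

$$D(\tau = 0) = p\,(r_1 x - r_2 y), \tag{E52}$$

where we have made use of (E45) besides obvious definitions.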

Since the Hamiltonian is a linear function of the control variable
$\phi$, the maximum principle does not determine the control when the
coefficient of $\phi$ vanishes for a finite interval of time (see p. 481
of [6]). We recall that the part of an optimal trajectory for which
this happens is called a singular subarc. As in the previous problem
on a singular subarc we have

$$\frac{r_1 x}{q_1} = \frac{r_2 y}{q_2}, \tag{E53}$$

with the singular control to remain on it given by

$$\phi = \frac{r_2}{r_1 + r_2}. \tag{E54}$$



Again, it is readily verified that the necessary condition for the 
singular subarc to yield a maximum return [57] is met. 

Let us now examine the determination of the optimal control at
the end of the problem t = T or $\tau = 0$. Substituting the boundary
conditions (E45) into (E47), we obtain

$$A(\tau = 0) = p r_1 x,$$

and

$$B(\tau = 0) = p r_2 y, \tag{E55}$$

and hence (E46) becomes

$$\phi(t=T) = \begin{cases} 1 & \text{for } r_1 x(T) > r_2 y(T) \\ 0 & \text{for } r_1 x(T) < r_2 y(T). \end{cases} \tag{E56}$$

In constructing the optimal trajectories and tracing the optimal
course of the bomber utilization (backwards from the end of the
prescribed duration period of usage) it is convenient to consider the
following. We recall that the optimal control is determined by the sign
of $D(\tau)$ (see (E50), (E49), and (E47)). From (E53) a singular subarc
must occur on the line L defined by
$\dfrac{r_1 x}{q_1} = \dfrac{r_2 y}{q_2}$. We recall that at the end of
the planning horizon $\tau = 0$, we have

$$D(\tau = 0) = p(t=T)\{r_1 x(t=T) - r_2 y(t=T)\}.$$

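Consider now the line L' defined by $r_1 x = r_2 y$. This line will lie
above, on, or below the line L defined by
$\dfrac{r_1 x}{q_1} = \dfrac{r_2 y}{q_2}$ depending on whether $q_1$ is
greater than, equal to, or less than $q_2$. This is evident from
considering the slopes of these two lines which pass through the origin

$$\left.\frac{dy}{dx}\right|_{L} = \frac{r_1 q_2}{r_2 q_1}, \qquad \left.\frac{dy}{dx}\right|_{L'} = \frac{r_1}{r_2},$$

and hence, for example,

$$\left.\frac{dy}{dx}\right|_{L'} > \left.\frac{dy}{dx}\right|_{L} \quad \text{for } q_1 > q_2.$$

The significance of the line L' and its relationship to the line L
is that

$$D(\tau = 0) \begin{cases} > 0 & \text{below } L' \\ < 0 & \text{above } L', \end{cases} \tag{E57}$$

and hence by (E50) we find that

$$\phi(t=T) = \begin{cases} 1 & \text{for } P(T) \text{ below } L' \\ 0 & \text{for } P(T) \text{ above } L', \end{cases} \tag{E58}$$

(where $P(T) = (x(t=T), y(t=T))$). We also note from (E51) that

$$\frac{dD(\tau)}{d\tau} \begin{cases} > 0 & \text{below } L \\ < 0 & \text{above } L. \end{cases} \tag{E59}$$

Thus, (E58) and (E59) give us three cases to consider

Case (a) $q_1 = q_2$

Case (b) $q_1 > q_2$

Case (c) $q_1 < q_2$.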
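The two sign conditions (E57) and (E59) are what drive the backwards construction: the sign of $D(\tau = 0)$ fixes the control at the end of the horizon, and the sign of $dD/d\tau$ tells whether $D$ grows or shrinks as we move backwards. A small helper makes the geometry concrete. The function name, the sample terminal point, and the rates are illustrative assumptions.

```python
def terminal_signs(x, y, r1, r2, q1, q2, p=1.0):
    """Signs that steer the backwards construction from a terminal point."""
    d0 = p * (r1 * x - r2 * y)                          # D(tau = 0), (E52)
    slope = p * q1 * q2 * (r1 * x / q1 - r2 * y / q2)   # dD/dtau, (E51)
    # Terminal control from (E50); None marks the undetermined case D = 0.
    phi_T = 1.0 if d0 > 0 else (0.0 if d0 < 0 else None)
    return d0, slope, phi_T

# Illustrative (assumed) Case (b) data with q1 > q2: L is y = x/2 and
# L' is y = x, so the point (1, 0.75) lies above L and below L'.
d0, slope, phi_T = terminal_signs(1.0, 0.75, r1=1.0, r2=1.0, q1=2.0, q2=1.0)
print(d0, slope, phi_T)
```

For this point the bomber ends the campaign on target area 1 ($\phi = 1$) while $D$ decreases in backwards time, which is the situation in which an earlier switch in tactics can occur.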



For Case (a): $q_1 = q_2 = q$, equation (E51) and initial condition
(E52) are

$$\frac{dD}{d\tau} = p\,q\,(r_1 x - r_2 y)$$

with

$$D(\tau = 0) = p\,(r_1 x - r_2 y).$$

There are three cases to consider depending on the sign of $D(\tau = 0)$.
Case (1) $r_1 x(t=T) = r_2 y(t=T)$

We see that this corresponds to when the system ends up on the
singular subarc, i.e., $D(\tau = 0) = 0$. In this case
$\phi(t=T) = \dfrac{r_2}{r_1 + r_2}$, and we continue (in backwards
progression) to use the singular control $\phi(t) = r_2/(r_1 + r_2)$ to
remain on $r_1 x = r_2 y$ (note that this makes
$\dfrac{dD}{d\tau} = 0$ and that we had $D(\tau = 0) = 0$) until
$x(t) = x_0$ or $y(t) = y_0$. This yields three further subcases.

Subcase (1A) $r_1 x_0 < r_2 y_0$

Define $t_1$ as $t$ such that $x(t_1 > 0) = x_0$. Then we use
$\phi(t) = 0$ for $0 \le t \le t_1$. This is consistent by the following.
At $\tau = T - t_1$, we have $D(\tau = T - t_1) = 0$ and up until the
time of the next switch in tactics we have

$$\frac{dD}{d\tau} = p\,q\,(r_1 x_0 - r_2 y) < 0$$

for $T - t_1 \le \tau \le T$ and hence

$$0 = D(\tau = T - t_1) > D(\tau).$$

From (E50) we see that

$$\phi(\tau) = 0 \quad \text{for } T - t_1 \le \tau \le T. \tag{E61}$$

Subcase (1B) $r_1 x_0 > r_2 y_0$

A similar argument as that for Subcase (1A) with the roles
of x and y interchanged readily shows that

$$\phi(\tau) = 1 \quad \text{for } T - t_1 \le \tau \le T. \tag{E62}$$

Subcase (1C) $r_1 x_0 = r_2 y_0$

We use $\phi(t) = r_2/(r_1 + r_2)$ from the beginning.

Case (2) $r_1 x(t=T) < r_2 y(t=T)$

In this case we have $D(\tau = 0) = p\{r_1 x(t=T) - r_2 y(t=T)\} < 0$,
and by (E50) at the end of the planning horizon we have
$\phi(\tau = 0) = 0$ so that $y(\tau = 0) < y(\tau)$ for $\tau > 0$.
Thus we have until the time $\tau_1$ of the first switch in tactics

$$\frac{dD}{d\tau} = p\,q\,\{r_1 x(t=T) - r_2 y(\tau)\} < 0 \quad \text{for } 0 \le \tau \le \tau_1,$$

and hence

$$0 > D(\tau = 0) > D(\tau).$$

From (E50) we see that

$$\phi(t) = 0 \quad \text{for } 0 \le t \le T. \tag{E63}$$

Case (3) $r_1 x(t=T) > r_2 y(t=T)$

A similar argument as that for Case (2) with the roles of x and
y interchanged readily shows that

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E64}$$

We now consider Case (b): $q_1 > q_2$. There are two cases to be
considered.

Case (1) never on singular subarc for finite interval of time

Again there are two subcases to consider, depending on whether
the system winds up above or below L.

Subcase (1a) $\dfrac{r_1 x(t=T)}{q_1} > \dfrac{r_2 y(t=T)}{q_2}$

The definitions of Case (b) and Subcase (1a) imply

$$\frac{r_1 x(t=T)}{r_2 y(t=T)} > \frac{q_1}{q_2} > 1$$

so that we have

$$r_1 x(\tau = 0) > r_2 y(\tau = 0).$$

Thus by (E52) $D(\tau = 0) > 0$ and hence by (E50) $\phi(t=T) = 1$. We
consider now the $\tau$-time interval up until the time of the first
switch in tactics. Use of $\phi(\tau) = 1$ for $\tau \in [0, \tau_1]$
results in $x(\tau) > x(\tau = 0)$ for $\tau > 0$. Recalling that

$$\frac{dD}{d\tau} = p\,q_1 q_2 \left(\frac{r_1 x}{q_1} - \frac{r_2 y}{q_2}\right)$$

for $\tau \in [0, \tau_1]$ and the definition of this case, we easily see
that $\dfrac{dD}{d\tau} > 0$ and hence

$$0 < D(\tau = 0) < D(\tau).$$

From (E50) we see that

$$\phi(t) = 1 \quad \text{for } 0 \le t \le T. \tag{E65}$$

Subcase (1b) $\dfrac{r_1 x(t=T)}{q_1} < \dfrac{r_2 y(t=T)}{q_2}$



Again there are two further subcases to consider, depending
on whether the system winds up above or below L'.

Subcase (1bI) $\dfrac{r_1 x(t=T)}{q_1} < \dfrac{r_2 y(t=T)}{q_2}$ and
$r_1 x(t=T) < r_2 y(t=T)$

In this case we wind up above L'. Since $D(\tau = 0)$ is given
by (E52), we have $D(\tau = 0) < 0$ and hence by (E50)
$\phi(\tau = 0) = 0$. Since we are initially above L and remain so by use
of $\phi(\tau) = 0$, we have by (E59) $\dfrac{dD}{d\tau} < 0$ for all
$\tau \in [0, T]$ and hence $D(\tau) < 0$ for all $\tau$.
Thus we have

$$\phi(t) = 0 \quad \text{for } 0 \le t \le T. \tag{E66}$$

Subcase (1bII) $\dfrac{r_1 x(t=T)}{q_1} < \dfrac{r_2 y(t=T)}{q_2}$ and
$r_1 x(t=T) > r_2 y(t=T)$

In this case we wind up below L' at the end. Since
$D(\tau = 0)$ is given by (E52), we have $D(\tau = 0) > 0$ and hence by
(E50) $\phi(\tau = 0) = 1$. We work backwards from the end. Since we are
above L, $\dfrac{dD}{d\tau} < 0$ while we remain above L. Thus
$D(\tau)$ decreases for $\tau > 0$ while we remain above L. There are two
further subcases depending on whether $D(\tau)$ decreases to zero before
the line L is encountered. Let $\tau_1$ be such that $D(\tau_1) = 0$. If
L has not yet been reached at $\tau_1$, then $D(\tau)$ for
$\tau > \tau_1$ is negative and $\phi(\tau) = 0$ until the beginning of
battle. It is also possible that the system just reaches L the instant
that $D(\tau_1) = 0$. In this case (assuming we don't remain on the
singular subarc) $D(\tau) > 0$ for $\tau > \tau_1$, since we pass below L
and then $\dfrac{dD}{d\tau} > 0$.

Case (2) on singular subarc for finite interval of time

This can happen only when
$\dfrac{r_1 x(t=T)}{q_1} < \dfrac{r_2 y(t=T)}{q_2}$ and
$r_1 x(t=T) > r_2 y(t=T)$. As usual, we work backwards from the end of
the planning horizon. We use $\phi(\tau) = 1$ for
$0 \le \tau \le \tau_1$, and at $\tau = \tau_1$ we must have
$\dfrac{r_1 x(\tau_1)}{q_1} = \dfrac{r_2 y(\tau_1)}{q_2}$. We use the
singular control $\phi(\tau) = r_2/(r_1 + r_2)$ for
$\tau_1 \le \tau \le \tau_2$. There are three further subcases

(1) $x(\tau_2) = x_0$, $y(\tau_2) < y_0$,

(2) $x(\tau_2) < x_0$, $y(\tau_2) = y_0$,

(3) $x(\tau_2) = x_0$, $y(\tau_2) = y_0$.



We omit the trivial discussion of these cases.

Thus, to summarize, we see that there are six possible cases for
the history of the strategic worth of the two target areas in the use
of the bomber for a prescribed length of time:

(1) started below L and never reached L,

(2) always above L',

(3) started above L' and end up above L but below L'
without ever reaching L,

(4) end up above L but started below L and did not remain
on L for finite interval of time,

(5) started above (or on) L and were on L for finite
interval of time,

(6) started below L and were on L for finite interval of
time.

Case (c): $q_1 < q_2$ is similar to Case (b).



c. Summary of Solutions . 

In this section we summarize the solutions developed in the 
previous section for the four versions of the continuous stochastic 
gold-mining problem. We shall summarize the cases of non-diminishing 
and diminishing returns separately. 

The solution for the case of non-diminishing returns is shown in
Table E1. We note that for both cases considered the optimal policy
is independent of the current strategic values of the two target areas,
i.e., the state variables. For the case of maximizing the return for
a specified risk, the optimal policy is independent of the risk
(cumulative probability of bomber being shot down) and depends only on
the ratios $r_i/q_i$, which we may interpret as the expected gain per
unit time divided by the expected loss per unit time.



Table E1. Solution to Continuous Stochastic Gold-Mining Problem

Note: $\tau_1$ is given by (E8).

For the case of prescribed duration use with non-diminishing
returns, we consider the case of $q_2 > q_1$ with the other case being
similar with the roles of x and y interchanged. The condition
$q_2 > q_1$ means that there is a larger risk per unit time of the bomber
being lost over the second target area. Consider the planning horizon
of length T. During the closing stages of length $\tau_1$ of this bombing
campaign, we send the bomber to the target area of greater return per
unit time regardless of the risk. The length of this interval, $\tau_1$,
is, of course, dependent on the risks involved and will be shorter as
the chances of the bomber being shot down over target area two become
greater. During the initial stages of the bombing campaign, i.e., for
$0 \le t \le T - \tau_1$, we allocate the bomber giving consideration to
the risks, and the solution is identical to the previous case.
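The statement that the closing stage shortens as the risk over target area two grows can be read off (E8) numerically. The sketch below evaluates $\tau_1$ for an increasing sequence of $q_2$ values with the other rates held fixed; the rates and the function name are illustrative assumptions, chosen so that $r_2 > r_1$ and $r_2/q_2 < r_1/q_1$ hold throughout.

```python
import math

def closing_stage_length(r1, r2, q1, q2):
    """Length tau_1 of the closing stage, from (E8)."""
    return math.log(r2 * (q2 - q1) / (r1 * q2 - r2 * q1)) / q2

# Illustrative (assumed) rates; a switch requires q2 > r2 * q1 / r1 = 2.
r1, r2, q1 = 1.0, 2.0, 1.0
lengths = [closing_stage_length(r1, r2, q1, q2)
           for q2 in (2.5, 3.0, 4.0, 6.0, 12.0)]
print(lengths)
```

For these values the computed lengths decrease steadily as $q_2$ increases, in line with the remark above.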

When there are diminishing returns, the solution is seen to 
depend on the strategic values of the target areas. Consequently, we 
have chosen to plot the optimal policies as a function of the state 
variables . 

The case of maximizing return for a specified risk with diminishing
returns is shown in Figure E1. It is seen that the line L defined
by $\dfrac{r_1 x}{q_1} = \dfrac{r_2 y}{q_2}$ plays a central role in the
solution. We may interpret a quotient like $\dfrac{r_1 x}{q_1}$ as
representing the expected return per unit time divided by the expected
loss per unit time for operating in the target area. Another
interpretation is return per unit cost per unit time.
The optimal policy is to send the bomber to the target area which
maximizes the return per unit risk (cost). In this respect this
solution is identical to that of non-diminishing returns except now, of
course, the expected return per unit time depends on the strategic value
of the target area. The paths labelled on Figure E1 correspond to the
nomenclature of Section b3. above. We note that this solution is the
same as that for prescribed duration use when $q_1 = q_2$, i.e., there
is equal risk of losing the bomber in the two target areas.

Figure E1. Solution to Stochastic Gold-Mining Problem with Diminishing Returns

For the case of prescribed duration use with diminishing returns
there are three cases to consider. The solution for Case (a): $q_1 = q_2$
is the same as that for maximizing return for specified risk as discussed
above. The case when $q_1 > q_2$ is shown in Figure E2. The paths are
denoted according to our terminology of Section b4. Again, consider
the total time of the bombing campaign. During the early stages we
allocate giving consideration to risks, but during the closing stages,
the bomber is sent to the target area yielding the greater return per
unit time (as measured by $r_1 x$ and $r_2 y$) regardless of risk.
Although we have not made an explicit determination, it seems reasonable
to conjecture by analogy with the case of non-diminishing returns that
the greater the risk at target area one, the shorter this interval will
be. During the previous period, i.e., $0 \le t \le T - \tau_1$, the
bomber is allocated on the basis of return per unit cost as before.

d. Discussion . 

We have already noted that for non-diminishing returns the allocation
is independent of the state variables and effort is concentrated
on one alternative, whereas for diminishing returns the values of the
state variables must be considered and effort may be split over the
alternatives. We shall point out some similarities with the combat
allocation models of Appendix C and then attempt some generalizations.

Figure E2. Solution to Stochastic Gold-Mining Problem with Diminishing Returns: Case (b)

We should note the similarity of the structure of the optimal
allocation policies with that in selection of target type in combat
described by Lanchester-type equations. There appears to be an
underlying structure for allocation with diminishing returns and
allocation with non-diminishing returns. Let us recall that for a square
law attrition process, the attrition (return) per unit time per unit of
weapon system is a constant; whereas for a linear law attrition process,
the attrition (return) per unit time per unit of weapon system is
proportional to the number of targets remaining (diminishing returns).
This observation has prompted our conclusion in Appendix C that fire
is concentrated on a single target type only when the fire is "aimed"
and the target acquisition rate is not subject to diminishing returns.

We also note that the termination conditions of the scenario
(prescribed time or use until a given level of risk is reached) have an
effect upon the optimal allocation policy. We have noted in Appendix C a
similar result for tactical allocation in combat described by
Lanchester-type equations.

When we compare the results from the Lanchester attrition models 
to the stochastic gold-mining problems, the allocation appears to be 
different when one is not subject to a cost (loss) from the alternative 
not being used. It seems appropriate to consider in future work this 
type of attrition model to see what insight may be provided. 

We seem to have uncovered a general principle (although we most
likely are not the first) that allocation in the face of non-diminishing
returns and diminishing returns are two fundamentally different cases.
With diminishing returns, we must constantly observe the state of our
system.

APPENDIX F. A New Dynamic Kill Potential.

In this appendix we propose a dynamic measure of combat capability 
by means of the adjoint system of differential equations for Lanchester- 
type equations of combat. The current results are of a preliminary 
nature and may be revised in the future. 

What is a quantitative measure of effectiveness for a combat unit 
or weapon system? In many circumstances it appears to be the rate of 
destruction of the enemy. A more sophisticated approach is to consider 
the rate of destruction of enemy capability as measured by the rate of 
destruction of his kill rate against the friendlies. 

We have devised a simple way to determine a dynamic kill potential
which is the rate of destruction of enemy kill rate giving full
consideration to the future course of combat. Consider a weapon system of
constant kill rate capability employed in combat against an enemy.
The loss of such a weapon is weighted more heavily in the early stages
than in later ones. This is because of the "multiplying effect" of the
dynamics of combat, i.e., loss of a weapon is also loss of future
killing capability of the weapon.

Such a concept has application to force structuring and weapon
system analysis. In such work, frequently a large number of alternatives
have to be screened. It is infeasible to assess the effectiveness of
all the alternate force/weapons mixes by a computer simulation of a
standardized scenario. The concepts of firepower scores and weapon
firepower potential have been developed to screen out unattractive
alternatives in preliminary analyses. We have extended these concepts
to consider the true dynamics of combat. Originally we were motivated
by the interpretation of the adjoint system of differential equations
in optimal control theory.

In this appendix we state the problem, give some additional back- 
ground, and then propose our solution. We then comment on other 
applications of these ideas before presenting a brief justification of 
our concept. Finally, we point out the deep relationship of this seem- 
ingly simple notion to linear analysis. 

This is our initial effort on this problem from a purely
mathematical point of view. For the future, we would propose to compare
firepower potentials computed by current methods and by our new method
and also to improve and expand the exposition. We are currently
supervising a student thesis on this topic from a more applied standpoint
("Weapon Firepower Potential" by Major James B. Taylor, USA).

a. Statement of the Problem.

To devise a quantitative measure of the combat capability of a
unit/weapon system giving consideration to the dynamics of combat.

b. Some Background.

We could consider a "static" kill potential, the rate of destruction
of the enemy kill rate against the friendlies not considering the
future course of battle. The concept of firepower scores has evolved
into the notion of weapon firepower potential. The latter considers
attrition rates as we have indicated but in a "static" fashion. In
practice, analysts use operational ammunition consumption rates and
operational kill/hit probabilities to estimate attrition rates.
Information systems have been designed to make available such information
on various systems in numerous circumstances. A high degree of
sophistication is not warranted for estimation of kill rates because of
the uncertainty in the data.

The current approach to weapon firepower potential does attempt
to consider combat dynamics in the following fashion: kill rates are
weighted more heavily at the longer ranges. This recognizes the
advantage of destroying the enemy at longer ranges before he becomes more
effective at killing friendlies at the closer ranges.

What we need is a measure which considers the dynamics of combat:
losses early in battle affect the outcome by evolving into more enemy
survivors and fewer friendlies. In the next section we show how to use
the concepts of operational definition and adjoint system of differential
equations to account for combat dynamics.

c. The Proposed Solution.

We employ the concept of an operational definition (see Chapter 
5 in [1]) by defining a dynamic firepower potential of a unit/weapon 
system under precise circumstances. Numerical measures can only be 
meaningfully compared under the applicable circumstances. 

We consider a standardized scenario of combat between an X-force 
and a Y-force in a battle lasting a prescribed time T. For illustra- 
tive purposes we consider the case of constant attrition rates. Our 
approach explained in Appendix D allows many variable attrition rate 
cases to be solved in closed form. This approach applies equally well 
to the adjoint system of differential equations considered here. 

We consider the rate of return of a unit/weapon system (in terms
of destruction of enemy kill rate) as measured by the product of a
measure of enemy kill-rate worth and the enemy attrition rate by the
friendlies. In many circumstances these quantities will have to be
properly weighted averages. There is also the problem of combat
between heterogeneous forces. Such considerations are beyond the scope
of our simple illustrative example.

We define the dynamic firepower potential, F.P., as

$$\text{F.P.} = a\,p, \tag{F1}$$

where

a is the rate of attrition achieved by the unit/weapon system, and
p is the unit worth of enemy forces as measured by the rate of
change of the value of engagement in a standardized scenario.

An average firepower potential would be given by

$$\overline{\text{F.P.}}_T = \frac{1}{T}\int_0^T a(t)\,p(t)\,dt. \tag{F2}$$

We shall see that p(t) is a variable dual to the state variables,
x and y, which describe the course of combat as a sequence of points
for average force strength.

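We consider now a battle lasting from t = 0 until t = T with
the combat described by

$$\frac{dx}{dt} = -ay, \qquad \frac{dy}{dt} = -bx, \tag{F3}$$

which we may write as

$$\frac{dX}{dt} = \begin{pmatrix} 0 & -a \\ -b & 0 \end{pmatrix} X, \tag{F4}$$

where X is a column vector of average force strengths, i.e.,
$X = \begin{pmatrix} x \\ y \end{pmatrix}$.

The adjoint system of differential equations for (F4) is

$$\frac{dP}{dt} = \begin{pmatrix} 0 & b \\ a & 0 \end{pmatrix} P, \tag{F5}$$

where $P = \begin{pmatrix} p_1 \\ p_2 \end{pmatrix}$.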

What is our motivation for considering the adjoint system of
differential equations? The transposed system of equations has long been
used to study the consistency (solvability) of a system of linear
equations. If we were to use finite differences to approximate the
Lanchester-type equations (F3), we would obtain a system of linear
equations. Forming the transposed system and passing to the limit, we
obtain the adjoint system. Usually, one develops the adjoint system by
integrating by parts, but we feel that the considerations here provide
more insight.
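The finite-difference argument can be checked numerically. The following is a minimal sketch, assuming a forward-Euler discretization of (F3) and illustrative values for a, b, the force levels, and the terminal values of P (none of these numbers come from the report). The state is stepped with the matrix I + hM, the dual variables are stepped backwards with its transpose, and the product X · P then stays constant from step to step up to rounding.

```python
# Forward-Euler sketch of dX/dt = M X with M = [[0, -a], [-b, 0]].
# All numerical values are illustrative assumptions.
a, b = 0.04, 0.06      # attrition-rate coefficients of (F3)
T, n = 10.0, 1000      # battle length and number of time steps
h = T / n


def step_state(x, y):
    """X_{k+1} = (I + h M) X_k."""
    return x - h * a * y, y - h * b * x


def step_adjoint(p1, p2):
    """P_k = (I + h M)^T P_{k+1}: the transposed system, run backwards."""
    return p1 - h * b * p2, p2 - h * a * p1


def discrete_invariant_spread(x0=100.0, y0=80.0, p_final=(1.0, -1.0)):
    """Return max - min of x*p1 + y*p2 over all time steps."""
    states = [(x0, y0)]
    for _ in range(n):
        states.append(step_state(*states[-1]))
    adjoints = [p_final]
    for _ in range(n):
        adjoints.append(step_adjoint(*adjoints[-1]))
    adjoints.reverse()  # adjoints[k] now belongs to time k*h
    dots = [x * p1 + y * p2 for (x, y), (p1, p2) in zip(states, adjoints)]
    return max(dots) - min(dots)


print(discrete_invariant_spread())
```

Dividing the backward recursion by h and letting h tend to zero gives dp1/dt = b p2 and dp2/dt = a p1, which is the adjoint system written out componentwise.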

We may also write (F5) as

$$\frac{dp_1}{dt} = b\,p_2, \qquad \frac{dp_2}{dt} = a\,p_1. \tag{F6}$$

Let us now multiply the first of (F3) by $p_1$, the second by $p_2$,
and add to obtain

$$p_1\frac{dx}{dt} + p_2\frac{dy}{dt} = p_1(-ay) + p_2(-bx).$$

Similarly for (F6)

$$x\frac{dp_1}{dt} + y\frac{dp_2}{dt} = x(b\,p_2) + y(a\,p_1).$$

Hence

$$p_1\frac{dx}{dt} + p_2\frac{dy}{dt} + x\frac{dp_1}{dt} + y\frac{dp_2}{dt}
  = \frac{d}{dt}(x p_1 + y p_2) = 0,$$

or

$$\frac{d}{dt}(X \cdot P) = 0,$$

and hence

$$X(t) \cdot P(t) = \text{const.} \tag{F7}$$

We may interpret this last condition as a compatibility requirement
which implies that if initial conditions are given for X, then
the only appropriate boundary condition for P is at t = T. Hence,
we specify the following conditions for (F6)

$$p_1(t = T) = A, \qquad p_2(t = T) = B, \tag{F8}$$

and thus, letting $\tau = T - t$, the solution to (F6) and (F8) is given
by

$$p_1(\tau) = A\cosh\sqrt{ab}\,\tau - \sqrt{\frac{b}{a}}\,B\sinh\sqrt{ab}\,\tau,$$

and

$$p_2(\tau) = B\cosh\sqrt{ab}\,\tau - \sqrt{\frac{a}{b}}\,A\sinh\sqrt{ab}\,\tau. \tag{F9}$$
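As a numerical check of (F7) and (F9), the sketch below evaluates the closed-form constant-coefficient solution of (F3) together with the adjoint solution (F9) and confirms that x p1 + y p2 takes the same value at several times. The coefficient values, initial strengths, and terminal values A and B are illustrative assumptions only.

```python
import math


def lanchester_state(t, x0, y0, a, b):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x from (x0, y0)."""
    s = math.sqrt(a * b)
    ch, sh = math.cosh(s * t), math.sinh(s * t)
    x = x0 * ch - y0 * math.sqrt(a / b) * sh
    y = y0 * ch - x0 * math.sqrt(b / a) * sh
    return x, y


def adjoint(t, T, A, B, a, b):
    """Solution (F9) of dp1/dt = b*p2, dp2/dt = a*p1 with p1(T)=A, p2(T)=B."""
    s = math.sqrt(a * b)
    tau = T - t  # backward time
    ch, sh = math.cosh(s * tau), math.sinh(s * tau)
    p1 = A * ch - B * math.sqrt(b / a) * sh
    p2 = B * ch - A * math.sqrt(a / b) * sh
    return p1, p2


def value_of_engagement(t, x0=100.0, y0=80.0, a=0.04, b=0.06,
                        T=10.0, A=1.0, B=-1.0):
    """x(t)*p1(t) + y(t)*p2(t); (F7) says this does not depend on t."""
    x, y = lanchester_state(t, x0, y0, a, b)
    p1, p2 = adjoint(t, T, A, B, a, b)
    return x * p1 + y * p2


for t in (0.0, 2.5, 5.0, 10.0):
    print(t, value_of_engagement(t))
```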

Let us call V the value of engagement given by

$$V = x(T)p_1(T) + y(T)p_2(T) = x(t)p_1(t) + y(t)p_2(t). \tag{F10}$$

Hence we see that

$$p_1(t) = \frac{\partial V}{\partial x}(t) \quad\text{and}\quad
  p_2(t) = \frac{\partial V}{\partial y}(t). \tag{F11}$$

We call $p_1$, $p_2$ dual variables, and they determine the combat's
trajectory in terms of line coordinates, whereas the state variables, x
and y, determine it in terms of point coordinates.

We have noted in dynamic tactical allocation models that if
surviving forces at t = T are assigned a worth proportional to their
kill rate, then target selection depends on the product of kill rates
(target and firer). This has influenced our definition of dynamic kill
potential.

d. Some Comments.

The above is the same approach used by G. Bliss in developing 
range tables for correcting artillery fire due to abnormal air densities, 
weights of projectiles, winds, etc., shortly after World War I [17], [67]. 

We may think of the p's (dual variables) as the line coordinates of
the trajectory (path) of the battle represented by (F3), i.e., x = x(t)
and y = y(t) (the solution to (F3)) define a curve in the x,y space.

The duality of Euclidean geometry (after adding the ideal point at infinity) 
states that we may equally well represent a curve as either a sequence 
of points (point coordinates) or as an envelope of tangents (line coordi- 
nates). When points are transformed by a linear transformation, the 
line coordinates are transformed by the transposed (or dual) matrix of 
this transformation. Let us note that we may consider a linear differ- 
ential equation to be the limit of linear equations. 

e. Justification.

We may use the condition X · P = const. to develop justification
for calling $p_1$ the rate of change of the value of the engagement with
respect to X forces, $\partial V/\partial x$. Consider a battle lasting
a specified length of time T. Hence, we have

$$x(t)p_1(t) + y(t)p_2(t) = x(T)p_1(T) + y(T)p_2(T). \tag{F12}$$

If at time t the X commander had $\Delta x(t)$ fewer troops, then this
would cause him to have fewer surviving troops at the end of battle
and the enemy (Y) to have more. In fact, the p's tell us how much,
as we see below

$$(x(t) - \Delta x(t))p_1(t) + y(t)p_2(t)
  = (x(T) - \Delta x(T))p_1(T) + (y(T) + \Delta y(T))p_2(T). \tag{F13}$$

Combining (F12) and (F13), we obtain

$$\Delta x(t)p_1(t) = \Delta x(T)p_1(T) - \Delta y(T)p_2(T).$$

Letting $p_1(T) = 1$ and $p_2(T) = -1$, we see why we have referred to
the p's as the value of forces

$$\Delta x(t)p_1(t) = \Delta x(T) + \Delta y(T). \tag{F14}$$

From the above, we see that the variable $p_1(t)$ shows what effect
the loss of one X soldier at time t would have on the outcome
of battle. Expressing the value of engagement, V, in terms of survivors,
we see that

$$p_1(t) = \frac{\partial V}{\partial x}(t) \quad\text{and}\quad
  p_2(t) = \frac{\partial V}{\partial y}(t).$$

Bliss's idea for the development of air density corrections for
the artillery range tables was similar.
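Relation (F14) can be illustrated with a small perturbation experiment. The sketch below assumes constant attrition rates and illustrative force levels (all numbers are hypothetical). It removes a few X troops at time t, propagates both the nominal and the perturbed battle to time T with the closed-form solution of (F3), and compares the combined change in survivors with Δx(t) p1(t), where p1 is taken from (F9) with p1(T) = 1 and p2(T) = -1.

```python
import math


def propagate(x, y, a, b, dt):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x over an interval dt."""
    s = math.sqrt(a * b)
    ch, sh = math.cosh(s * dt), math.sinh(s * dt)
    return (x * ch - y * math.sqrt(a / b) * sh,
            y * ch - x * math.sqrt(b / a) * sh)


def p1(t, T, a, b):
    """p1(t) from (F9) with terminal values p1(T) = 1 and p2(T) = -1."""
    s = math.sqrt(a * b)
    tau = T - t
    return math.cosh(s * tau) + math.sqrt(b / a) * math.sinh(s * tau)


def check_f14(x_t=60.0, y_t=50.0, loss=1.0, t=3.0, T=10.0, a=0.04, b=0.06):
    """Return (loss * p1(t), dx(T) + dy(T)); (F14) says the two agree."""
    x_end, y_end = propagate(x_t, y_t, a, b, T - t)
    x_end_pert, y_end_pert = propagate(x_t - loss, y_t, a, b, T - t)
    dx_end = x_end - x_end_pert   # X ends with fewer survivors
    dy_end = y_end_pert - y_end   # Y ends with more survivors
    return loss * p1(t, T, a, b), dx_end + dy_end


print(check_f14())
```

Because (F3) is linear, the agreement is exact for any size of loss, not only for small ones.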



f. Relation to Other Mathematics.

The underlying mathematical structure considered here (duality)
manifests itself in many of the modern operations research optimization
tools. Let us recall that we showed

for $\dfrac{d\vec{X}}{dt} = A\vec{X}$ and
$\dfrac{d\vec{P}}{dt} = -A^{T}\vec{P}$, we must have

$$\vec{X} \cdot \vec{P} = \text{const.} \tag{F15}$$

The finite dimensional analogue of this relationship is

for $A\vec{x} = \vec{b}$ and $A^{T}\vec{y} = \vec{c}$, we must have

$$\vec{y} \cdot \vec{b} = \vec{c} \cdot \vec{x}. \tag{F16}$$

When extended to non-negative variables, this is

for $A\vec{x} = \vec{b}$, $\vec{x} \ge 0$ and $A^{T}\vec{y} \ge \vec{c}$,
we must have

$$\vec{y} \cdot \vec{b} \ge \vec{c} \cdot \vec{x}. \tag{F17}$$

The latter relationship may be used to develop many results in the
theory of linear programming. For example, an immediate consequence
is that for $\vec{x}$ that maximizes $\vec{c} \cdot \vec{x}$ subject to
$A\vec{x} = \vec{b}$ and $\vec{x} \ge 0$, a sufficient condition is
given by

$$A^{T}(B^{-1})^{T}\vec{c}_B \ge \vec{c},$$

where B is a non-singular matrix such that $B\vec{x}_B = \vec{b}$,
$\vec{x}_B$ is the vector of non-zero components of the solution, and
$\vec{c}_B$ holds the corresponding components of $\vec{c}$. The above
condition is expressed in the linear programming literature as
$z_j - c_j \ge 0$.
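The finite dimensional relations (F16) and (F17) are easy to verify on a small example. The sketch below uses a hypothetical 2 x 3 matrix and hypothetical vectors, chosen only so that the hypotheses of each relation hold: b is built as A x, and c is built from the transposed product A^T y.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Hypothetical data, chosen only for illustration.
A = [[1.0, 2.0, 1.0],
     [3.0, 1.0, 2.0]]
x = [1.0, 0.5, 2.0]          # non-negative, as (F17) requires
y = [2.0, 1.0]

b = matvec(A, x)                          # guarantees A x = b
c_equal = matvec(transpose(A), y)         # A^T y = c, the setting of (F16)
c_lower = [ci - 0.5 for ci in c_equal]    # A^T y >= c, the setting of (F17)

print(dot(y, b), dot(c_equal, x))         # equal, as in (F16)
print(dot(y, b), dot(c_lower, x))         # first >= second, as in (F17)
```

Both relations follow from the single identity y · (A x) = (A^T y) · x; the inequality in (F17) additionally uses x >= 0.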

To further indicate the fundamental nature of these concepts, we
note that a further generalization of (F15) is

for $Lu(x) = f(x)$ and $L^{*}v(x) = g(x)$, we must have

$$\int \{v(x)\,Lu(x) - u(x)\,L^{*}v(x)\}\,dx = \text{boundary terms}, \tag{F18}$$

where L is a linear differential operator and $L^{*}$ is its adjoint.
This is known as Green's identity (p. 183 [62]) and has many important
applications to ordinary and partial differential equations. From it
one obtains the Green's functions for constructing solutions.

APPENDIX G. Applications to Deterministic Inventory Theory



In this section we consider the optimization of continuous review 
deterministic inventory models by the Pontryagin Maximum Principle. 
Several previously published results are extended. For linear produc- 
tion rate costs, we show that when demand is known with certainty and 
stock may be reordered at any point (continuously) in time, the optimal 
inventory policy is to only order as needed and only do this after the 
initial inventory has been depleted. The same type of policy is true 
when there are budgetary constraints with the constraint being ignored 
until the budget has been expended. We also have developed an alter- 
nate method of analysis to that developed by Arrow and Karlin [3] for 
the case of convex production rate costs. Our results on this latter 
topic are not fully documented at this time. 

Our reasons for considering inventory problems are twofold:
(1) such problems are a major aspect of defense planning and (2) our
previous research has considered operations research models with a
similar mathematical structure. Our past research has uncovered several
facets of formulating and solving such dynamic models. For example,
by application of the theory of singular control [53], [54], [57], we
have shown that when the production cost rate function is linear, the
optimal inventory policy is insensitive to the nature of the shortage
(or penalty) cost function (as long as this is not pathological).

Our organization of this section is as follows: we review the
general deterministic inventory model and the shortcomings of the
classical calculus of variations methods for such a model before we
consider our sequence of models. Then, we discuss the insight that we
have gained into optimal inventory policies. We begin by surveying
some previous work in the field of deterministic inventory theory.

An excellent introduction to elementary inventory theory and in- 
ventory theory in general prior to 1957 is to be found in [26]. Dy- 
namic models were not considered prior to 1951. A more advanced in- 
troduction to inventory theory is by Arrow, Karlin, and Scarf [4], 
who summarize work through 1958 and give an extensive bibliography. 

Variational methods were applied to a deterministic inventory process by
Arrow and Karlin [3] in this work. An excellent survey of modelling
techniques and results has been written by Karlin [56]. Adiri and
Ben-Israel [2] attempted to extend the work of Arrow and Karlin by use
of the Pontryagin maximum principle. A comprehensive bibliography of
applications of optimal control theory to operations research problems
has been published by Tracz [77]. Considering this last reference, it
appears as though the above work and the references cited therein
represent most of the published results on dynamic, deterministic
inventory models. Recently McMasters [63] has studied the Arrow and
Karlin problem. However, we obtain here results different from those of
McMasters. Our results are more in consonance with those of Arrow and
Karlin [3].

a. The General Model.

We consider a deterministic inventory process subject to continuous
review. Karlin has an excellent discussion and classification of
inventory models and our present discussion has been based on his [56].
We consider that all processes occur continuously in time. We shall
see that this leads to a problem in the calculus of variations. However,
two factors that are commonly present in applications preclude
the direct application of the classical calculus of variations results:
(1) non-negativity of variables and (2) inequality constraints.

Karlin [56] identifies four main factors in the inventory process:

(1) cost factors, 

(2) nature of demand for inventory, 

(3) nature of supply for inventory, 

(4) mechanism of inventory process. 

We assume a single item inventory. We consider a production cost,
c(u(t)), per unit time which depends only upon the rate of production
u(t). We also consider a storage or holding cost, h(I(t)), which
depends upon the inventory level I(t). Originally, h(I(t)) is only
defined for $I(t) \ge 0$, but we may extend this to $I(t) < 0$ by
considering shortage or penalty costs for not meeting inventory demand.
We omit considerations of the "time value of money" (discount rate).

The nature of the inventory demand is assumed to be perfectly
known and is given by r(t), which is the demand rate. We consider
a deterministic supply without setup costs. The production rate is
denoted by u(t). We consider an inventory process without lags and
continuous in time. Our decision criterion is the minimization of
total cost. The basic type of model we consider is the minimization
of a cost functional

$$J[u] = \int_0^T [c(u(t)) + h(I(t))]\,dt, \qquad T \text{ specified},$$

with the inventory being given by

$$I(t) = I(0) + \int_0^t [u(s) - r(s)]\,ds.$$

The production rate is, of course, restricted to be non-negative, i.e.,
$u(t) \ge 0$.
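To make the general model concrete, the following minimal sketch evaluates the cost functional J[u] for a given production policy by a simple rectangle-rule discretization in time. The function name `total_cost`, the step count, and the example cost and demand functions are hypothetical choices made for illustration; they are not part of the report.

```python
def total_cost(u, r, c, h, initial_inventory, horizon, n=1000):
    """Approximate J[u] = integral of c(u(t)) + h(I(t)) over [0, horizon].

    u, r : callables giving production and demand rates at time t
    c, h : callables giving production-rate cost and holding/shortage cost
    Returns the approximate cost and the terminal inventory level.
    """
    dt = horizon / n
    inventory, cost = initial_inventory, 0.0
    for k in range(n):
        t = k * dt
        cost += (c(u(t)) + h(inventory)) * dt
        inventory += (u(t) - r(t)) * dt   # dI/dt = u - r
    return cost, inventory


# Hypothetical example: linear costs, constant demand, produce to demand.
cost, final_inventory = total_cost(
    u=lambda t: 1.0,
    r=lambda t: 1.0,
    c=lambda rate: 2.0 * rate,
    h=lambda level: 0.5 * level,
    initial_inventory=3.0,
    horizon=4.0,
)
print(cost, final_inventory)
```

In this example the inventory never changes, so the cost is simply (2 + 1.5) per unit time over four time units.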

b. Shortcomings of the Classical Calculus of Variations.

We have already noted two model factors that prevent direct
application of classical calculus of variations results: (1) non-negative
variables and (2) inequality constraints. Our own research, however,
indicates that these difficulties may be overcome by the formulation of
an equivalent problem. A similar approach may be used to develop many
non-linear programming results by the calculus [59]. For example, when
there are non-negative variables in our original problem, we may
formulate an equivalent problem by replacing x by $u^2$. We solve this
equivalent problem for u and then recover our original variable x.
Inequality constraints are easily converted to equality constraints by
the addition of non-negative slack variables.
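The substitution x = u^2 can be illustrated on a one-dimensional problem. The sketch below is only an illustration: it minimizes a function f(x) over x >= 0 by plain gradient descent on g(u) = f(u^2), which is an unconstrained problem in u. The function name `minimise_nonneg`, the step size, the iteration count, and the two test functions are assumptions made for the example.

```python
def minimise_nonneg(df, u0=1.0, step=0.05, iters=500):
    """Minimise f(x) over x >= 0 by descending on g(u) = f(u**2).

    df is the derivative of f. The chain rule gives g'(u) = f'(u**2) * 2*u,
    so no explicit non-negativity constraint is needed on u.
    """
    u = u0
    for _ in range(iters):
        u -= step * df(u * u) * 2.0 * u
    return u * u   # recover the original variable x


# Constraint inactive: f(x) = (x - 2)^2 has its minimum at x = 2.
x_interior = minimise_nonneg(lambda x: 2.0 * (x - 2.0))

# Constraint active: f(x) = (x + 1)^2 is minimised over x >= 0 at x = 0.
x_boundary = minimise_nonneg(lambda x: 2.0 * (x + 1.0))

print(x_interior, x_boundary)
```

One caveat of this device is that u = 0 is always a stationary point of g, so the descent should not be started exactly at u = 0.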

c. Comments on Previous Work.

Our general comment is that when variational methods were attempted
before the advent of the Pontryagin maximum principle, little
more than a first variation approach leading to an Euler-Lagrange
equation was employed. We should note that the Pontryagin maximum
principle involves both the Euler-Lagrange equations and the Weierstrass
condition for the Weierstrass excess function. It is not surprising
that use of but one calculus of variations tool from among many (there
are four well-known necessary conditions, i.e., the Euler, Weierstrass,
Legendre (second order), and Jacobi conditions) has not been able to
solve all problems.

F. Morin [64] appears to be one of the first economists to formulate
and attempt to solve a deterministic inventory model with continuous
time. No backlogging of orders was allowed (no stockouts).
It should be noted that Morin tried to apply some theory developed
by Bolza (see [18] pp. 41-43) for extremal curves on the boundary of
the state space.

Arrow and Karlin [3] have solved Morin's problem. Whereas Morin
tried to apply Bolza's results directly to his problem, Arrow and
Karlin develop the solution to this specific problem by variational
methods. Anyone doubting the complexities of applying variational
methods to problems with non-negative variables and inequalities
should consult this work. In our notation the Arrow-Karlin problem
was

$$\min_{u(t)} \int_0^T [c(u(t)) + h(I(t))]\,dt \quad\text{with } T \text{ specified},$$

subject to: $\dfrac{dI}{dt} = u(t) - r(t)$,

and $u(t) \ge 0$, $I(t) \ge 0$

with boundary conditions

$$I(t = 0) = I(0) \quad\text{and}\quad I(t = T) = 0. \tag{G1}$$

Arrow and Karlin [3] solve the above model for linear holding rate costs
and general convex production rate costs. Their general solution algorithm
is applied to linear production rate costs and several other examples,
including quadratic production costs. The theoretical foundations of
Arrow and Karlin's analysis are not immediately evident from the
content of their paper which merely summarizes the results. The central
point is that one-sided variations are required when the inventory is
at a zero level. Arrow and Karlin apparently developed an extension
of the usual variational development for problems where convexity
properties can be assumed. Their approach, however, does not seem to be
documented in any of the mathematical literature known to this author.

Adiri and Ben-Israel [2] applied the Pontryagin maximum principle
to Arrow and Karlin's problem as well as to the classical optimal lot
size problem. However, because of the boundary condition $I(t = T) = 0$,
the value of the dual variable $p(t) = (\partial J^{*}/\partial I)(t)$
is free at t = T. Since they never determine the value of the dual
variable at t = T, i.e., p(t = T), they never do solve this problem. In
fact, their conclusion as to the solution for linear production costs is
unsupported by their analysis (the conclusion that the partial derivative
of the Hamiltonian with respect to the control variable is always
negative is unsupported).

We re-examine the solution to the Arrow-Karlin problem given by
(G1) above. The constraint on the state variable, $I(t) \ge 0$, implies
that we must have $dI/dt \ge 0$ when I(t) = 0. Hence, we have

$$u(t) \;\begin{cases} \ge 0 & \text{for } I(t) > 0 \\
                       \ge r(t) & \text{for } I(t) = 0. \end{cases} \tag{G2}$$

We must further check to see if the state variable constraint has an
effect on the adjoint equation (see [24] p. 117), but we see that it
does not since $(\partial/\partial I)\{dI/dt\} = 0$. The Hamiltonian is
given by

$$H(t,I,p,u) = c(u(t)) + h(I(t)) + p(t)\{u(t) - r(t)\},$$

so that the extremal control is given by

$$\min_{u(t)} \{c(u(t)) + p(t)u(t)\}. \tag{G3}$$

We note that p(t) > 0 implies that the minimum of (G3) is given by
the minimum u(t) given by (G2). The adjoint equation for the dual
variable $p(t) = (\partial J^{*}/\partial I)(t)$ (see [12] for this
interpretation) is given by

$$\frac{dp}{dt} = -\frac{\partial H}{\partial I} = -\frac{dh}{dI}.$$

We introduce the backwards time $\tau = T - t$ so that
$dp/d\tau = dh/dI$ and hence

$$p(\tau) = \int_0^{\tau} \frac{dh}{dI}\,d\tau + p(\tau=0).$$

Because of the constraint $I(t) \ge 0$ for all time, it is necessary to
consider two separate cases at $\tau = 0$. When I(t=T) > 0, then
$p(\tau=0) = 0$. This generates a further condition on I(t=0) so that
the end state I(t=T) > 0 may be reached. When I(t=T) = 0, it may
be shown that $p(\tau=0)$ must be < 0. The precise value of
$p(\tau=0)$ is determined by further simultaneous conditions.

McMasters [63] also considers the above models. Unlike Arrow and
Karlin [3], who assumed that I(t=T) = 0, he makes no assumption about
the inventory level at the end of the planning period. He does not
distinguish between the two cases that we have above ((1) I(t=T) > 0
and (2) I(t=T) = 0) and consequently derives different results. He
also considered the problem when shortages (stockouts) are allowed. He
solves this problem for linear production and holding costs but does
not recognize the singular solution [53] in his model. We show in the
present work that more general results are possible, i.e., if production
costs are linear, then the optimal inventory policy is relatively
insensitive to the nature of holding and shortage costs as long as
$dh/dI > 0$ for I > 0 and $dh/dI < 0$ for I < 0.

d. A Sequence of Models.

In this section we consider a sequence of Arrow-Karlin type models:
no stockouts, stockouts allowed with linear production costs, and budget
constraints. We have also considered a model where there is a special
penalty cost for being out of inventory at the end of the planning period
in the stockouts allowed case. This was prompted by a disturbing feature
of the stockout model: developing a shortage at the end of the planning
period turns out to be the optimal policy. This is related to future
demand being known with certainty. Neither the model nor its policy
applies in many real-world circumstances.

No Stockouts

We consider the problem

$$\min_{u(t)\,\ge\,0} \int_0^T [c(u(t)) + h(I(t))]\,dt \quad\text{with } T \text{ specified},$$

subject to: $\dfrac{dI}{dt} = u(t) - r(t)$,

and $u(t) \ge 0$, $I(t) \ge 0$

with initial condition

$$I(t=0) = I(0). \tag{G4}$$

We assume that holding costs are a non-decreasing function of the
inventory level, i.e., $dh/dI \ge 0$. As above, the constraint on the
state variable $I(t) \ge 0$ implies that we must have $dI/dt \ge 0$ when
I(t) = 0, so that (G2) applies. It is easily checked that this last
condition does not modify the adjoint equation (see [24] p. 117). The
Hamiltonian is given by

$$H(t,I,p,u) = c(u(t)) + h(I(t)) + p\{u(t) - r(t)\}, \tag{G5}$$

so that the optimal control (there is only one extremal) is given by

$$\min_{u(t)} \{c(u(t)) + p(t)u(t)\}, \tag{G6}$$

where u(t) must satisfy (G2). The adjoint equation for the dual variable
is given by

$$\frac{dp}{dt} = -\frac{\partial H}{\partial I} = -\frac{dh}{dI}. \tag{G7}$$

There are two cases to consider for the boundary condition on the dual
variable at t = T, depending on whether I(T) > 0 or I(T) = 0.

Case A. I(T) > 0.

In this case p(t=T) = 0, since there is no terminal payoff (we
have the problem of Lagrange in the classical literature). We introduce
the backward time $\tau = T - t$ so that $dp/d\tau = -dp/dt$ and hence

$$p(\tau) = \int_0^{\tau} \frac{dh}{dI}\,d\tau \ge 0
  \quad\text{for all } \tau \ge 0. \tag{G8}$$

Since we assume the production costs to be non-decreasing, (G6)
immediately yields the optimal inventory policy

$$u^{*}(t) = \begin{cases} 0 & \text{for } I(t) > 0 \\
                           r(t) & \text{for } I(t) = 0. \end{cases}$$

Now since I(T) > 0, then $u^{*}(T) = 0$. By a continuity argument, it is
easy to show that $u^{*}(t) = 0$ in a neighborhood of T, i.e.,
$t \in (T-\delta, T]$ for $\delta > 0$. From the state equation of (G1),
we have

$$I(t) = \int_t^T \{r(s) - u(s)\}\,ds + I(t=T),$$

and hence

$$I^{*}(t) = \int_t^T r(s)\,ds + I(t=T),$$

so it is easy to see that $I^{*}(t) > 0$ for all t and hence
$u^{*}(t) = 0$ for all t. Thus, we require that

$$I(0) > \int_0^T r(t)\,dt.$$

Hence, we see the obvious result that you never produce if you can meet
all future demand.
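The Case A result can be illustrated by simulating the policy "produce nothing while inventory is positive, produce to demand once it is exhausted". The sketch below is a simple Euler simulation with hypothetical numbers (constant unit demand over ten time units); the function name and step count are assumptions for illustration.

```python
def simulate_no_stockout_policy(initial_inventory, r, horizon, n=10000):
    """Simulate u = 0 while I > 0 and u = r(t) once I = 0.

    Returns total production over [0, horizon] and the terminal inventory.
    """
    dt = horizon / n
    inventory, produced = initial_inventory, 0.0
    for k in range(n):
        t = k * dt
        u = 0.0 if inventory > 0.0 else r(t)
        produced += u * dt
        # Clamp at zero: the no-stockout constraint I(t) >= 0.
        inventory = max(inventory + (u - r(t)) * dt, 0.0)
    return produced, inventory


def demand(t):
    return 1.0   # hypothetical constant demand rate


# Initial stock exceeds total demand (12 > 10): nothing is ever produced.
print(simulate_no_stockout_policy(12.0, demand, 10.0))

# Initial stock is short of total demand (4 < 10): production starts
# when the stock runs out and then just matches demand.
print(simulate_no_stockout_policy(4.0, demand, 10.0))
```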



Case B. I(T) = 0.

In this case p(t=T) is unspecified. The nature of c(u(t)) now
affects the structure of the optimal inventory policy. Hence we must
consider three further subcases for production rate costs

(1) concave,

(2) linear,

(3) convex.

In the current report we do not carry the analysis any further. We have
completed the analysis for a quadratic production-rate cost and constant
demand rate. We have obtained the same results in this special case as
Arrow and Karlin [3], who used a variational approach which (to the best
of this author's knowledge) is found nowhere else in the applied
mathematics literature. We hope to document our complete results in a
future report.

It seems appropriate to indicate the nature of our results. In the
cases of concave and linear production rate costs, the optimal inventory
policy turns out to be

$$u^{*}(t) = \begin{cases} 0 & \text{for } I(t) > 0 \\
                           r(t) & \text{for } I(t) = 0. \end{cases}$$

This is not surprising. In the case of convex production rate costs
(this might be due to plant expansion or overtime to attain higher
production rates), we have obtained Arrow and Karlin's results. We feel
that our approach is more general and hope to explore its capability
further in the future.

Stockouts Allowed

We consider the same problem as above only we remove the constraint
that $I(t) \ge 0$. We assume that

$$\frac{dh}{dI} \;\begin{cases} > 0 & \text{for } I(t) > 0 \\
                                < 0 & \text{for } I(t) < 0. \end{cases}$$

Equations (G5), (G6), and (G7) are readily seen to be still applicable.
We can no longer guarantee that $p(\tau) \ge 0$ for all $\tau$ and thus
(G6) no longer yields the optimal control by inspection. We consider

$$\frac{\partial H}{\partial u} = \frac{dc}{du} + p,$$

and note that $u^{*}(t) = 0$ for $\partial H/\partial u > 0$. To proceed
further we must make assumptions on the nature of the production costs
c(u(t)) (all we had to assume previously was that c(u(t)) was a
non-decreasing function of u). Since we may also have
$\partial H/\partial u < 0$, we must further restrict u(t) as follows

$$0 \le u(t) \le b.$$

We have not carried the analysis in this most general case further. The
details appear to be messy but straightforward. Instead we specialize
the problem.

Stockouts Allowed - Linear Production Cost

We consider the problem

$$\min_{u(t)} \int_0^T [a\,u(t) + h(I(t))]\,dt \quad\text{with } T \text{ specified},$$

subject to: $\dfrac{dI}{dt} = u(t) - r(t)$,

and $0 \le u(t) \le b$ (also a > 0)

with initial condition

$$I(t=0) = I(0). \tag{G9}$$

We make the following assumptions on the holding and penalty costs

$$\frac{dh}{dI} \;\begin{cases} > 0 & \text{for } I(t) > 0 \\
                                = 0 & \text{for } I(t) = 0 \\
                                < 0 & \text{for } I(t) < 0, \end{cases} \tag{G10}$$

and also $d^2h/dI^2 > 0$ for I(t) = 0. Later we will see that we only
require h(I) to have a minimum at I = 0 so that h(I) need not be
twice differentiable at I = 0.

The Hamiltonian is given by

$$H(t,I,p,u) = au + h(I) + p(u-r), \tag{G11}$$

and it is seen that the optimal control (there is only one extremal) is
usually given by

$$u^{*}(t) = \begin{cases} 0 & \text{for } p(t) > -a \\
                           b & \text{for } p(t) < -a. \end{cases} \tag{G12}$$

The adjoint equation for the dual variable (in backwards time
$\tau = T - t$) is

$$\frac{dp}{d\tau} = \frac{dh}{dI} \quad\text{with}\quad p(\tau=0) = 0, \tag{G13}$$

and hence

$$p(\tau) = \int_0^{\tau} \frac{dh}{dI}\,d\tau. \tag{G14}$$

If $I(t=T) \ge 0$, then it is easy to see by (G10), (G12), and (G14)
that $u^{*}(t) = 0$ for $0 \le t \le T$. If I(t=T) < 0, then we have by
(G10) and (G14) that $p(\tau) < 0$ near $\tau = 0$. Also considering
(G12), we see that $u^{*} = 0$ for $0 \le \tau \le \tau_1$, where
$\tau_1$ is determined by

$$\int_0^{\tau_1} \frac{dh}{dI}\,d\tau = -a,
  \quad\text{and}\quad
  I(\tau) = \int_0^{\tau} r\,d\tau + I(t=T), \tag{G15}$$

with the demand rate r in the last integral evaluated at the
corresponding forward time.
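The conditions (G15) can be solved numerically once a cost function is chosen. The sketch below is a worked example under two assumptions that are not in the report: a quadratic holding-shortage cost h(I) = k I^2 / 2 and a constant demand rate r. It also uses the requirement that the inventory is zero at backward time tau_1, where the no-production interval begins. Under these assumptions the integral in (G15) can be done by hand, giving tau_1 = sqrt(2a / (k r)), which the bisection reproduces.

```python
import math


def p_backward(tau1, k, r, n=2000):
    """Integral of dh/dI over backward time 0..tau1 with u = 0.

    Assumes h(I) = k*I**2/2 (so dh/dI = k*I), constant demand r, and a
    terminal shortage chosen so that the inventory is zero at tau1.
    """
    terminal_inventory = -r * tau1
    d = tau1 / n
    total = 0.0
    for j in range(n):
        sigma = (j + 0.5) * d                        # midpoint rule
        total += k * (terminal_inventory + r * sigma) * d
    return total


def solve_tau1(a, k, r, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection on p_backward(tau1) = -a, the first condition of (G15)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_backward(mid, k, r) + a > 0.0:
            lo = mid      # p has not yet fallen to -a: tau1 is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, k, r = 2.0, 0.5, 1.0                  # hypothetical cost and demand data
print(solve_tau1(a, k, r), math.sqrt(2.0 * a / (k * r)))
```

The interpretation is the usual marginal one: production stops when the cost a of one more unit equals the shortage cost that unit would have saved over the remaining time.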



Since the Hamiltonian is a linear function of the control variable
u, the minimum principle does not determine the control when the
coefficient of u vanishes, i.e., p(t) = -a, for a finite interval
of time (see p. 481 of [6]). Part of a trajectory for which this happens
is called a singular subarc. We determine the conditions for a singular
subarc from [54]

$$\frac{\partial H}{\partial u} = 0, \qquad
  \frac{d}{dt}\left(\frac{\partial H}{\partial u}\right) = 0. \tag{G16}$$

We have from (G11) that

$$\frac{\partial H}{\partial u} = a + p,
  \quad\text{and}\quad
  \frac{d}{dt}\left(\frac{\partial H}{\partial u}\right)
    = \frac{dp}{dt} = -\frac{dh}{dI}. \tag{G17}$$

Hence on a singular subarc we have

$$p(\tau) = -a \quad\text{and}\quad \frac{dh}{dI} = 0. \tag{G18}$$

The latter of these implies that I(t) = 0 on a singular subarc. From
(G15) we see that we reach the singular subarc at $\tau = \tau_1$. We
stay on it until we have to get off to meet the given initial condition
I(0). We stay on the singular subarc by using $u^{*}(t) = r(t)$, which
keeps I(t) equal to zero.

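A necessary condition for a singular subarc to yield a minimum
return is that [57]

$$\frac{\partial}{\partial u}\left\{\frac{d^2}{dt^2}
  \left(\frac{\partial H}{\partial u}\right)\right\} \le 0. \tag{G19}$$

From (G17) we have that

$$\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial u}\right)
  = -\frac{d}{dt}\left(\frac{dh}{dI}\right)
  = -\frac{d^2h}{dI^2}\,\frac{dI}{dt}
  = -\frac{d^2h}{dI^2}\,(u-r),$$

and hence

$$\frac{\partial}{\partial u}\left\{\frac{d^2}{dt^2}
  \left(\frac{\partial H}{\partial u}\right)\right\}
  = -\frac{d^2h}{dI^2}. \tag{G20}$$

Our assumption that $d^2h/dI^2 > 0$ for I = 0 guarantees that (G19) is
met. Hence, when the holding-shortage cost curve has a minimum at I = 0,
i.e., $dh/dI = 0$ and $d^2h/dI^2 > 0$, we may have an optimal singular
solution holding the inventory at zero. By a limiting argument we may
dispense with the condition that $d^2h/dI^2 > 0$ and only require that
h(I) has a minimum at I = 0.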

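To summarize, the optimal inventory policy is given by

$$ u^*(t) = \begin{cases} 0 & \text{for } I(t) > 0, \\ r(t) & \text{for } I(t) = 0, \\ b & \text{for } I(t) < 0, \end{cases} \qquad \text{for } t \in [0, T-\tau_1], $$

and

$$ u^*(t) = 0 \quad \text{for } t \in (T-\tau_1, T], \tag{G21} $$

where $\tau_1$ is determined by (G15).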
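The feedback form of (G21) can be illustrated with a small simulation. The Python sketch below is not from the original report: it assumes constant demand `r`, a quadratic cost $h(I) = kI^2/2$ (for which (G15) gives $\tau_1 = \sqrt{2a/(kr)}$), a maximum production rate `b` greater than `r`, and a simple Euler step. All function names and parameter values are hypothetical.

```python
import math


def closing_interval(a, k, r):
    """tau_1 from (G15), assuming h(I) = 0.5*k*I**2 and constant demand r."""
    return math.sqrt(2.0 * a / (k * r))


def simulate(I0, a, k, r, b, T, tau, dt=1e-3):
    """Euler simulation of policy (G21) with a closing no-production interval tau.

    Returns (terminal inventory, total production plus holding/shortage cost).
    """
    tol = b * dt                  # band within which inventory is treated as zero
    I, cost = I0, 0.0
    for i in range(int(round(T / dt))):
        t = i * dt
        if t >= T - tau:
            u = 0.0               # closing interval: no production
        elif I > tol:
            u = 0.0               # never produce while inventory is on hand
        elif I < -tol:
            u = b                 # work off a backlog at the maximum rate
        else:
            I, u = 0.0, r         # singular subarc: produce to demand, hold I at zero
        cost += (a * u + 0.5 * k * I * I) * dt
        I += (u - r) * dt
    return I, cost


# Assumed example data.
params = dict(I0=3.0, a=2.0, k=1.0, r=4.0, b=10.0, T=10.0)
tau1 = closing_interval(params["a"], params["k"], params["r"])
I_T, cost_opt = simulate(tau=tau1, **params)
_, cost_short = simulate(tau=0.8 * tau1, **params)
_, cost_long = simulate(tau=1.2 * tau1, **params)
```

Under these assumptions the simulated cost with the closing interval set to $\tau_1$ should be lower than with a somewhat shorter or longer interval, and the terminal inventory should be close to $-r\tau_1$.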






Budget Constraints - Production Costs Only

We consider the same model as immediately above, except that we assume
there is a budget constraint on production, i.e., we must have

$$ \int_0^T c(u(t))\,dt \le A, $$



where A is the total production budget. We shall see that the optimal 
inventory policy is the same as immediately above: only the closing 

interval of no production begins earlier. Since the problem is the same 
as above when the budget constraint is not binding, we assume that 






$$ a\left[\int_0^{T-\tau_1} r(t)\,dt - I(0)\right] > A, \tag{G22} $$

where $\tau_1$ is given by (G15). Thus, we consider



$$ \min_{u(t)} \int_0^T \left[a\,u(t) + h(I(t))\right] dt \quad \text{with } T \text{ specified}, $$

subject to:

$$ \frac{dI}{dt} = u(t) - r(t), \qquad \frac{dM}{dt} = a\,u(t), \qquad \text{and} \qquad 0 \le u(t) \le b, $$

with boundary conditions

$$ I(t=0) = I(0), \qquad M(t=0) = 0, \qquad M(t=T) = A, \tag{G23} $$

where $M(t)$ is total expenditures on production through time $t$. As
before we assume (G10) for the holding and penalty costs.

The Hamiltonian is given by

$$ H(t,I,p,u) = a\,u + h(I) + p_1\,(u-r) + p_2\,a\,u, \tag{G24} $$

and it is seen that the optimal control on non-singular subarcs is
given by

$$ u^*(t) = \begin{cases} 0 & \text{for } p_1(t) > -a(1+p_2), \\ b & \text{for } p_1(t) < -a(1+p_2). \end{cases} \tag{G25} $$



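The adjoint equations for the dual variables are

$$ \frac{dp_1}{dt} = -\frac{\partial H}{\partial I} = -\frac{dh}{dI}, \qquad p_1(t=T) = 0, $$

$$ \frac{dp_2}{dt} = -\frac{\partial H}{\partial M} = 0 \;\Longrightarrow\; p_2(t) = \text{const, and no condition on } p_2(t=T). \tag{G26} $$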



It is easy to see that we must have $p_2 > 0$. Recalling the well-known
interpretation of the dual variables [12], we see that
$p_2 = \partial J^*/\partial M$. Since increasing total expenditure
increases the minimum inventory cost, we have $\partial J^*/\partial M > 0$.
We could also argue that if $p_2$ were negative, then $\tau_2$ defined by
(where $\tau = T - t$)

$$ \int_0^{\tau_2} \frac{dh}{dI}\,d\tau = -a(1+p_2) $$

would be less than $\tau_1$ defined by (G15). Thus production would occur
for a longer period of time, and this is impossible since we assume
that the budget constraint is binding.
Other solution details are similar to the case above, and we omit
them. The optimal inventory policy is given by

$$ u^*(t) = \begin{cases} 0 & \text{for } I(t) > 0, \\ r(t) & \text{for } I(t) = 0, \\ b & \text{for } I(t) < 0, \end{cases} \qquad \text{for } t \in [0, T-\tau_2], $$

and

$$ u^*(t) = 0 \quad \text{for } t \in (T-\tau_2, T], \tag{G27} $$

where $\tau_2$ is determined by

$$ a \int_0^{T-\tau_2} u^*(t)\,dt = A, $$

since we assume that (G22) holds.
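To make the role of the binding budget concrete, the following Python sketch evaluates $\tau_1$, $\tau_2$, and the multiplier $p_2$ for a simple special case. It is an illustration under stated assumptions rather than the report's own computation: demand `r` is constant, $h(I) = kI^2/2$, $I(0) \ge 0$, and the budget is large enough that the initial stock is exhausted before production stops. Under these assumptions the relation $\int_0^{\tau_2} (dh/dI)\,d\tau = -a(1+p_2)$ reduces to $k r \tau_2^2/2 = a(1+p_2)$. The function name and example data are hypothetical.

```python
import math


def budget_closing_interval(I0, a, k, r, T, A):
    """Return (tau1, tau2, p2) for the production-cost budget case.

    Assumes constant demand r, h(I) = 0.5*k*I**2, I0 >= 0, and a budget A
    that still permits production after the initial stock runs out.
    """
    tau1 = math.sqrt(2.0 * a / (k * r))            # unconstrained closing interval
    unconstrained_spend = a * (r * (T - tau1) - I0)
    if unconstrained_spend <= A:
        return tau1, tau1, 0.0                     # budget not binding
    # Binding budget: a * (r*(T - tau2) - I0) = A.
    tau2 = T - (A / a + I0) / r
    # Adjoint at the junction: -k*r*tau2**2/2 = -a*(1 + p2).
    p2 = k * r * tau2 ** 2 / (2.0 * a) - 1.0
    return tau1, tau2, p2


# Assumed example data: unconstrained spending would be 66, the budget is 50.
tau1, tau2, p2 = budget_closing_interval(I0=3.0, a=2.0, k=1.0, r=4.0, T=10.0, A=50.0)
```

With these numbers the closing interval lengthens from $\tau_1 = 1$ to $\tau_2 = 3$ and $p_2 = 8 > 0$, in line with the argument that a binding budget makes the closing interval of no production begin earlier.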

Budget Constraints - Production and Holding Costs 
We extend the above model to the case of a budget constraint on
total production plus holding costs, i.e., we must have

$$ \int_0^T \left[c(u(t)) + h_1(I(t))\right] dt \le A, $$

where $A$ is the total budget and

$$ h_1(I) = \begin{cases} h(I) & \text{for } I \ge 0, \\ 0 & \text{for } I < 0. \end{cases} $$



We shall see that the optimal inventory policy is the same as immediately
above, except that the closing interval of no production begins even earlier.
Since the solution to the problem is the same as (G21) when the constraint
is not binding, we assume that






$$ \int_0^{T-\tau_1} \left\{a\,r(t) + h_1(I(t))\right\} dt - a\,I(0) > A, \tag{G28} $$

where $\tau_1$ is given by (G15). Thus, we consider



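$$ \min_{u(t)} \int_0^T \left[a\,u(t) + h(I(t))\right] dt \quad \text{with } T \text{ specified}, $$

subject to:

$$ \frac{dI}{dt} = u(t) - r(t), \qquad \frac{dM}{dt} = a\,u(t) + h_1(I(t)), \qquad \text{and} \qquad 0 \le u(t) \le b, $$

with boundary conditions

$$ I(t=0) = I(0), \qquad M(t=0) = 0, \qquad M(t=T) = A. \tag{G29} $$

As before we assume (G10) for the holding and penalty costs.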

The Hamiltonian is given by

$$ H(t,I,p,u) = u\,(a + p_1 + p_2 a) + h(I) - p_1 r + p_2 h_1(I), \tag{G30} $$

and the optimal control on non-singular subarcs is given by (G25). The
adjoint equations are again given by (G26), and again we must have
$p_2 = \text{const} > 0$. The rest is similar to the previous isoperimetric
problem (integral constraint).
The optimal inventory policy is given again by (G27) with the
exception that $\tau_2$ is now determined by

$$ \int_0^{T-\tau_2} \left[a\,u^*(t) + h_1(I(t))\right] dt = A, $$

since we assume that (G28) holds.
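As a worked illustration that is not part of the original text, assume constant demand $r$, a quadratic cost $h(I) = kI^2/2$, an initial stock $I(0) > 0$, and a budget large enough that some production takes place. Under policy (G27) the stock is run down over $[0, I(0)/r]$, which incurs a holding cost of $\int_0^{I(0)/r} \tfrac{k}{2}\,(I(0) - rt)^2\,dt = k\,I(0)^3/(6r)$; production then proceeds at rate $r$ until $t = T - \tau_2$, after which $h_1 = 0$ because the inventory is negative. The budget condition then reads

$$ \frac{k\,I(0)^3}{6r} + a\,r\left(T - \tau_2 - \frac{I(0)}{r}\right) = A \quad\Longrightarrow\quad \tau_2 = T - \frac{I(0)}{r} - \frac{1}{a\,r}\left(A - \frac{k\,I(0)^3}{6r}\right). $$

Dropping the holding-cost term $k\,I(0)^3/(6r)$ recovers the value of $\tau_2$ for the budget on production costs alone, so in this special case charging holding costs against the budget lengthens the closing interval by $k\,I(0)^3/(6\,a\,r^2)$.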

e. Discussion.

In this section we review the structure of optimal inventory 
policies for the models we have considered in the previous section and 
attempt some generalizations. We also comment on the nature of
deterministic inventory models. As a general comment, we note the similarity
of these dynamic inventory models to the (one-sided) attrition games 
we have considered in previous appendices. This should alert us to the 
possibility of optimal inventory policies being dependent upon the type 
of boundary conditions specified. 

Considering the sequence of models in the previous section, we
observe that when future demand is known with certainty and the
production rate costs are concave (a special case of which is linear):

(a) never order while you have inventory, 

(b) if shortages are allowed, then the best policy is to run 
out of inventory at the end of the planning period, 

(c) budget constraints on production and holding costs are to 
be ignored (until they become binding) . 

For convex production rate costs, the situation is more complex. Under
certain circumstances, it is advantageous to produce at lower rates
before inventory is depleted rather than to hold off production until
stocks are entirely depleted, after which time higher production rates
would be required. This situation arises because marginal production rate
costs are an increasing function of the production rate. We
hope to explore this case more fully in the future.

These models have assumed perfect knowledge of the future. What 
is the effect of uncertainty? Uncertainty may cause inventory to be 
backlogged, but we are novices in this field. We have noted previously 
in the Lanchester theory of combat that if we interpret a linear law 
attrition process as being the result of uncertainty, then we "split" 
the allocation of fire among target types as a "hedge" against
uncertainty. We should also note that certain aspects of the solution
procedure for these dynamic deterministic models extend to the
stochastic case. For example, we determine the marginal costs of inventory
backwards from the end of the planning horizon. 

We should not lose sight of the fact that these models are idealizations
of a more complex real-world process. Therefore, the structure or nature
of optimal inventory policies and its dependence on model form is of
prime importance. The real world is considerably more uncertain than
the perfect knowledge of future demand assumed by these models, yet
there is much that we can learn from deterministic inventory theory.
Because of their idealized and simplified nature, it is possible to
develop "closed-form" solutions to many deterministic inventory models.
We have done this in the current report. In such solutions the
interdependence of model parameters is explicitly exhibited. This leads to
a better understanding of the structure of trade-off decisions to be
made. This should be contrasted with dynamic programming models (both
deterministic and probabilistic) for which, in most instances, a solution
is developed only for a specific set of parameter values. In this case,
it is difficult (if not impossible) to see the structure of optimal
inventory policies and its dependence on model form without a parametric
analysis of model output.

The intimate connection between variational methods and dynamic
programming (their dual relationship in the sense of J. Plücker's
principle of duality*) is well known [10], [30]. It is important to
understand the Hamilton-Jacobi approach to variational problems. In
discrete and stochastic cases, we formulate the analogue of the
Hamilton-Jacobi-Bellman equation for the optimal return. Hence, understanding
the principles of the solution procedure in the deterministic case
provides the insight for extensions.



* Actually first stated in non-algebraic terms by J. Gergonne.